vignette study - Improving patient experience in primary care: a multimethod programme of resea

Parts of this chapter are based on Burt et al.190 under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Abstract

Background

Although minority ethnic groups have consistently reported poorer care in patient surveys, it is not known whether this is because they receive worse care or because they respond differently to such surveys.

Methods

We conducted an experimental vignette study to investigate whether or not South Asian people rate simulated GP consultations differently from white British people. In total, 564 white British and 564 Pakistani adults were recruited using an in-home face-to-face approach. Trained fieldworkers completed computer-assisted personal interviews during which participants rated the communication within three video recordings of simulated GP–patient consultations. Consultations were shown in a random order, selected from a pool of 16. Mean differences in communication scores (on a scale of 0–100) between white British and Pakistani patients were estimated from linear regression.

Results

Pakistani participants, on average, scored consultations 9.8 points higher than white British participants (95% CI 8.0 to 11.7 points; p < 0.001) when viewing the same consultations. When adjusted for age, gender, deprivation, self-rated health and video, the difference increased to 11.0 points (95% CI 8.5 to 13.6 points; p < 0.001). The largest differences were seen in older participants (≥ 55 years) and when communication was scripted to be poor.

Conclusions

Substantial differences in ratings were found, with Pakistani respondents giving higher scores to videos showing the same care. If we take these findings at face value, they would suggest that the lower scores reported by Pakistani patients in national surveys such as the GP Patient Survey represent genuinely worse care.

Introduction and rationale for the study

As outlined in Chapter 5, some minority ethnic groups have reported consistently lower patient experience scores than the majority population in both the UK and the USA.75,150–153 Of particular concern within the UK, and confirmed by the analyses undertaken for this programme grant, South Asian groups report significantly more negative experiences of GP–patient communication than their white British counterparts.131,156 Potential explanations for these lower ratings focus on whether South Asian patients receive lower-quality care or rate similar care more negatively.

A number of potential drivers of more negative ratings of similar standards of care exist. For example, it has been suggested that differences in the use of questionnaire response scales (e.g. Elliott et al.80) may lead to South Asian groups being less likely to endorse the most positive options when asked to evaluate a doctor’s communication skills. Our analysis of GP Patient Survey data, drawing on item response theory to explore whether or not items receive systematically different responses from South Asian and white British groups, suggested that this was unlikely to be the case.159 Yet there are also other, alternative drivers of poorer ratings of similar care, most notably that the evaluation of consultations by South Asian respondents is influenced by systematic variations in their expectations of, or preferences for, care.

Fundamentally, these concerns centre on a well-recognised and long-standing problem with surveys: that individuals may interpret and respond to the ‘same’ question in many different ways.191 Potential solutions to this problem arose first within the field of political science, where the use of standardised scenarios, or vignettes, was proposed to evaluate the disparity in responses to survey items.82 Such approaches are particularly relevant to understanding minority ethnic experiences: as already described, alongside potential variations in scale use by individuals from various ethnic backgrounds, we also need to consider systematic cultural variations in expectations of or preferences for care, as well as the potential for systematic variations in actual experience. A recent US study81 adopted King et al.’s vignette methodology to examine the extent of cross-cultural incomparability in survey responses, using predominantly written vignettes. This online survey concluded that score variations observed on national surveys among African American, Latino and white respondents were likely to reflect true differences in real-life experiences, at least for items in the survey that used an ‘always to never’ response scale.81

The aim of this strand of work was to build on previous vignette approaches to examine whether or not people from a Pakistani background rate the communication within simulated GP consultations differently from white British people. If these groups rate simulated consultations similarly when viewing identical video vignettes, then we would be able to conclude that it is more likely that the lower scores previously reported by South Asian respondents in national patient experience surveys reflect real differences in quality of communication within consultations.

Changes to study methods from the original protocol

This strand of work, as stated in the original protocol, formed part of our wider aim of exploring in more detail the experiences of minority ethnic groups, together with the GP Patient Survey analyses reported in Chapter 5: to understand the reasons why minority ethnic groups, especially South Asians, give lower scores on patient surveys than the white British population (aim 5).

In our original protocol, to undertake this study we envisaged developing a DVD containing short clips (3–4 minutes) of four simulated patient consultations and asking respondents to rate these using the GP–patient communication items of the GP Patient Survey. These DVDs would be sent out, with questionnaires and instructions, to patients registered with practices with a high proportion of South Asian patients. We suggested using SANGRA (South Asian names and group recognition algorithm)192 to identify South Asian patients. In practice, we first devised a more robust and efficient approach to recruiting participants, using targeted face-to-face recruitment in partnership with the market research agency, Ipsos MORI. This enabled us to effectively reach a rigorously sampled set of participants of known Pakistani ethnicity. Second, participants rated simulated consultations during face-to-face computer-assisted interviews conducted by trained fieldworkers. This enabled us to collect high-quality and consistent ratings of consultations. Our recruitment and rating approach is detailed in full in Methods.

As we acknowledged in our original protocol, the requirement of the vignettes approach to show identical consultations to all participants meant that all videos had to be in English. However, we had stated that, although we would therefore have to exclude patients who could not understand English, we would make study questionnaires and documentation available in four Asian languages. As we employed face-to-face

EXPERIMENTAL VIGNETTE STUDY

NIHR Journals Library www.journalslibrary.nihr.ac.uk

computer-assisted interviews in the study, this was no longer necessary once we had screened for those who were confident in their ability in spoken English. This therefore represents a further improvement on our original study design.

Methods

In this experimental vignette study we showed videos of simulated GP–patient consultations to white British and Pakistani respondents, who were asked to rate the quality of the communication within each consultation that they viewed. The study advisory group was particularly involved in consideration of the nature of the vignettes to be shown and the study materials.

Simulated consultations

To ensure generalisability and to avoid the chance inclusion of a characteristic or event that, unknown to us, might systematically be rated differently by the two participating ethnic groups, we produced a series of 16 vignettes. We set out to manipulate the vignettes on three key domains:

1. the presenting complaint depicted within each consultation

2. the quality of the communication within each consultation (poor or good)

3. the ethnic background of the actors playing the doctor and patient (South Asian or white British).

Published recommendations for the production of vignettes emphasise the importance of developing a valid script and considering how best to manipulate this on the domains of interest.193 We therefore based our vignettes on real-life consultations that were video recorded as part of another workstream (the association between patients’, raters’ and GPs’ assessments of communication in a consultation, for which we recorded > 500 real-life consultations). We undertook an extensive process of script development, role playing and rating prior to filming the vignettes with professional actors (Figure 14).

The vignettes that we produced covered four different clinical scenarios: persistent cough, perforated ear drum, painful elbow and generalised numbness. We developed two different scripts for each clinical scenario: one designed to illustrate poor communication by the doctor and one designed to illustrate good communication. We formulated ‘poor’ and ‘good’ standards of communication according to the GCRS.126 This observer-rated measure of communication competence (derived from the widely used Calgary–Cambridge guide to the medical interview127,128) was developed as part of our workstream on patients’ and raters’ assessments of communication competence within a consultation. The GCRS instrument covers 12 domains including initiating the session, gathering information, building the relationship and achieving a shared understanding (see Appendix 1 for the full instrument). We then used both the ‘poor’ and the ‘good’ version of the four clinical scenarios to film two sets of vignettes. The first set of vignettes had white British actors playing the GP and the patient, whereas the second repeated the same scripts but with South Asian actors playing the GP and the patient. The GP role was acted throughout by either one white British or one South Asian actor; eight different actors (four white British and four South Asian) role-played patients, each participating in one clinical scenario. The final 16 videos were each scored by three trained clinical raters using the GCRS to assess communication quality in relation to professionally defined norms.126 Mean GCRS scores for the ‘poor’ communication vignettes ranged from 0.6 to 2.4 (out of 10), whereas mean GCRS scores for the ‘good’ communication vignettes ranged from 5.1 to 8.4.

Data collection

Ipsos MORI fieldworkers conducted data collection in collaboration with our team. As per the original protocol, we aimed to recruit 1120 respondents, each of whom was asked to rate three simulated GP–patient consultations. Our original sample size calculation was based on data from the General Practice Assessment Questionnaire (which includes some identical items to those in the GP Patient Survey); we repeated this using more recent GP Patient Survey data. This confirmed that the inclusion of 560 Pakistani

(on a 0–100 scale) seen between these two groups after controlling for age, gender, deprivation, self-rated health and practice. As our analyses of GP Patient Survey data had identified that ethnic disparities were largest among older age groups, we set out to recruit equal numbers above and below the age of 55 years within each ethnic group.156

Vignette development

Rationale

To vary vignettes on three domains:

1. the presenting complaint
2. the quality of GP–patient communication (poor or good)
3. the ethnic backgrounds of the doctor and patient (South Asian or white British)

Clinical content

• Derived from existing bank of > 500 video-recorded GP–patient consultations
• Identified consultations with unisex presenting complaints lasting < 7.5 minutes (n = 29)
• Four consultations selected by the research team: tennis elbow, persistent cough, numbness and perforated ear drum

Script development

• Each of the four clinical scenarios summarised for simulated patients on a pro forma covering patient sociodemographics, clinical details, patients’ perspectives, past medical history and social history
• Each scenario role played with a GP (JBe) and simulated patient in two versions – ‘good’ and ‘poor’ for communication. All role plays video recorded
• Communication quality of each role-played consultation scored using the GCRS by one rater (JS) to confirm the difference between ‘good’ and ‘poor’ communication role plays at this stage
• Role-played scenarios transcribed in full to act as scripts for the vignettes. Minor changes to the content and stage directions were made by the research team in consultation with the simulated patient

Vignette filming

• Vignette actors were recruited via an acting agency specialising in simulated patient role play
• Briefing packs were prepared for all actors, to include scripts, background summaries of vignettes and verbal and non-verbal behaviour guides
• Vignettes were filmed with a professional film crew over 2 days, involving 10 actors and one acting supervisor. At least one GP was present at all times to ensure clinical consistency
• Following filming, vignettes were professionally edited to create the final 16 films

Final vignettes

• 16 films ranging from 2 to 8 minutes
• Eight films involving a South Asian GP–patient actor pairing: four different clinical scenarios, each filmed in two versions, poor communication and good communication
• Eight films involving a white British GP–patient actor pairing: four different clinical scenarios, each filmed in two versions, poor communication and good communication

Rating vignettes

• Each vignette rated by three trained GCRS raters (all GPs) to determine its score for the quality of communication in relation to professionally defined norms

FIGURE 14 Development of the vignettes. Reproduced from Burt et al.190 under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Following consultation with Ipsos MORI, we used different recruitment strategies for the different ethnic groups. To recruit Pakistani respondents, we selected output areas (geographically confined areas of approximately 130 households) in which at least 35% of the population was identified as Pakistani in 2011 census data.173 These were then ranked according to the proportion of the population aged > 50 years (the cut-off point of 50 years of age used for sampling reflects available census categories; for our recruitment we specifically used a cut-off point of 55 years of age). Trained fieldworkers then recruited participants within these areas using an in-home face-to-face approach, starting in the output areas with the highest proportion of residents aged > 50 years. Fieldworkers were also provided with one or two output areas neighbouring the area sampled and were able to recruit from these if necessary. Snowball recruitment (e.g. known neighbours suggested to fieldworkers) and additional household interviews were allowed.

To recruit white British participants, we first excluded output areas with low proportions of white British residents (< 90%) and residents aged > 50 years. The remaining output areas were ranked by social grade (the percentage of people who were social grade A/B according to 2011 census data194) and geography. Ipsos MORI then selected output areas to approach using proportional systematic sampling.

Fieldworkers screened potential participants for ethnicity (using the ONS 18-group categorisation143) and for English-language competency (using a screening question regarding self-reported confidence in understanding short videos in English). Eligible respondents who consented then completed a computer-administered personal interview during which the fieldworker used a standardised script. Each participant viewed three of the sixteen simulated consultation videos that we had produced. Following each video, the participant was asked to rate the consultation using five GP–patient communication items taken from the most recent national GP Patient Survey (Table 23). We assigned videos so that each participant saw three different presenting conditions (and, therefore, videos), with two of the videos featuring South Asian–South Asian and white British–white British ethnic GP–patient pairs and at least one of the videos for each condition featuring either the ‘good’ or ‘poor’ communication script. The selection of videos shown to each participant was such that approximately equal numbers of all possible combinations were used, given the restrictions that we have described. Participants also completed basic sociodemographic questions (age, self-rated health, whether or not born in the UK, language spoken most often at home). An area-based measure of socioeconomic deprivation (IMD) was recorded based on the participants’ postcode.
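The video assignment described above can be illustrated with a short sketch. It is not the allocation procedure actually used in the study. It assumes one reading of the stated restrictions: each participant sees three different presenting complaints, and both actor pairings and both communication scripts appear at least once among the three videos. The names `VIDEOS`, `eligible_triples` and `assign` are hypothetical.

```python
from itertools import combinations, product

SCENARIOS = ["persistent cough", "perforated ear drum",
             "painful elbow", "generalised numbness"]
QUALITIES = ["good", "poor"]                 # communication script
PAIRINGS = ["South Asian", "white British"]  # ethnicity of the GP-patient actor pair

# The 16 vignettes: 4 scenarios x 2 communication scripts x 2 actor pairings.
VIDEOS = list(product(SCENARIOS, QUALITIES, PAIRINGS))


def eligible_triples():
    """Enumerate unordered sets of three videos meeting the assumed restrictions."""
    triples = []
    for triple in combinations(VIDEOS, 3):
        scenarios = {video[0] for video in triple}
        qualities = {video[1] for video in triple}
        pairings = {video[2] for video in triple}
        # Assumed reading of the restrictions: three different presenting
        # complaints, with both scripts and both pairings represented.
        if len(scenarios) == 3 and len(qualities) == 2 and len(pairings) == 2:
            triples.append(triple)
    return triples


def assign(participant_index, triples):
    """Cycle through the eligible triples so each is used about equally often."""
    return triples[participant_index % len(triples)]
```

Under these assumed restrictions there are 144 eligible sets of three videos, and each of the 16 videos appears in 27 of them, so cycling through the sets gives roughly balanced exposure per video. The order in which the three videos are shown would be randomised separately.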

Analysis

As in our previous analyses of GP Patient Survey data, we scored each participant’s rating of each consultation by linearly scaling the response options between 0 (very poor) and 100 (very good) and averaging all informative answers when at least three of the five items were completed. We used linear regression to model the mean difference between white British and Pakistani participants’ ratings of GP–patient communication. We estimated the unadjusted difference in ratings as well as the difference adjusting for patient age, gender, self-rated health, deprivation and a set of 15 indicator variables for the video. We did not originally plan to conduct any analysis of interaction terms. However, the effect size found was much larger than that anticipated in our original power calculations and so we investigated interactions between participant ethnicity and the following variables:

(a) relating to the video: ethnicity of GP/patient and quality of GP–patient communication

(b) relating to the participant: age, gender and deprivation.

TABLE 23 General practitioner communication items used to rate vignettes

Question: Thinking about the doctor you have just seen in the video, how good was the doctor at each of the following? Please put a ✗ in one box for each row

Response options (one tick box per option for each item): Very good; Good; Neither good nor poor; Poor; Very poor; Doesn’t apply^a

Items:

• Giving enough time
• Listening to the patient
• Explaining tests and treatments
• Involving the patient in decisions about his or her care
• Treating the patient with care and concern
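The scoring rule described in the Analysis section can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that linear scaling places the five substantive response options at equal intervals (100, 75, 50, 25, 0) and that ‘Doesn’t apply’ and missing answers count as uninformative. The function name `communication_score` is hypothetical.

```python
from typing import Optional, Sequence

# Assumed equal-interval 0-100 scaling of the five substantive response options.
SCALE = {
    "Very good": 100.0,
    "Good": 75.0,
    "Neither good nor poor": 50.0,
    "Poor": 25.0,
    "Very poor": 0.0,
}


def communication_score(responses: Sequence[Optional[str]],
                        min_items: int = 3) -> Optional[float]:
    """Mean of the informative item scores, or None if too few were answered.

    Answers not found in SCALE (for example "Doesn't apply" or None) are
    treated as uninformative, which is an assumption of this sketch.
    """
    informative = [SCALE[r] for r in responses if r in SCALE]
    if len(informative) < min_items:
        return None
    return sum(informative) / len(informative)
```

For example, a participant answering ‘Good’ to three items and giving no informative answer to the other two would receive a score of 75, whereas a participant with only two informative answers would receive no score for that consultation.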

When modelling interactions, we used only variables for the video attributes, rather than using indicator variables for all videos. For interactions involving age, the oldest two age groups were combined and a continuous version of the age groups was used in the interaction term only. CIs and p-values were estimated using bootstrapping with 500 replications (given non-normal data), clustered by participant (with each participant supplying three communication scores). We conducted a sensitivity analysis that clustered the bootstrap resampling by output area rather than by participant to account for multiple sampling in households and small geographical areas; however, this made only trivial changes to standard errors and we consequently do not report this here.
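The idea of a bootstrap clustered by participant can be sketched as below. This illustration makes several assumptions that the text does not confirm: it uses percentile intervals, fits an unadjusted model with a single group indicator, and runs on synthetic data generated purely for demonstration (not study data). The function names are hypothetical.

```python
import numpy as np


def ols_coefs(X, y):
    """Least-squares coefficients for the linear model y = X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def cluster_bootstrap_ci(X, y, cluster_ids, coef_index, n_reps=500, seed=0):
    """Percentile 95% CI for one coefficient, resampling whole clusters."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)
    rows_by_cluster = {c: np.flatnonzero(cluster_ids == c) for c in clusters}
    estimates = np.empty(n_reps)
    for b in range(n_reps):
        # Draw participants with replacement; keep all of their ratings together.
        drawn = rng.choice(clusters, size=clusters.size, replace=True)
        rows = np.concatenate([rows_by_cluster[c] for c in drawn])
        estimates[b] = ols_coefs(X[rows], y[rows])[coef_index]
    return np.percentile(estimates, [2.5, 97.5])


# Synthetic illustration only: 200 hypothetical participants, 3 ratings each.
rng = np.random.default_rng(42)
n_participants, n_ratings = 200, 3
participant = np.repeat(np.arange(n_participants), n_ratings)
group = np.repeat(rng.integers(0, 2, n_participants), n_ratings)  # 0/1 indicator
participant_effect = np.repeat(rng.normal(0, 10, n_participants), n_ratings)
noise = rng.normal(0, 15, participant.size)
score = np.clip(55 + 10 * group + participant_effect + noise, 0, 100)

X = np.column_stack([np.ones(participant.size), group])  # intercept + group
estimate = ols_coefs(X, score)[1]
low, high = cluster_bootstrap_ci(X, score, participant, coef_index=1)
```

Resampling participants rather than individual ratings keeps each participant's three scores together, so the interval reflects the within-participant correlation. Clustering by output area instead, as in the sensitivity analysis, would only require passing a different `cluster_ids` array.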

Results

Participants

We recruited a total of 1128 participants: 564 (50%) self-identified as white British and 564 (50%) self-identified as Pakistani. The sociodemographic profile of the participants is shown in Table 24. Although the sampling restriction that half of participants in each group be aged ≥ 55 years increased the similarity of the groups’ age distribution, Pakistani participants were younger than the white British participants within the sampled age strata. Pakistani participants were also more likely to be male (58% vs. 45%), to be in fair or poor health (38% vs. 26%) and to live in the most deprived areas (82% vs. 14%). Figure 15 shows the geographical locations from which participants were recruited. White British participants were recruited from a wide range of geographical locations, whereas, as a result of our sampling approach, Pakistani participants were recruited from a small number of geographically confined locations. Between 202 and 222 participants scored each of the video vignettes for GP–patient communication (Table 25).

Main results

The distribution of communication scores for white British and Pakistani participants was skewed in both groups; however, the communication scores from Pakistani participants were typically higher than those from white British participants (Figure 16). The mean communication score from Pakistani participants was 67.3 out of 100, 9.9 points higher (95% CI 8.0 to 11.7 points; p < 0.001) than the mean score from white British participants (57.4 out of 100). In a regression model (full output shown in Table 26) adjusting for participant age, gender, self-rated health, deprivation and video there was a slightly larger difference

In document Improving patient experience in primary care: a multimethod programme of research on the measurement and improvement of patient experience (Page 117-129)