Plain Language Summary of the Evaluation of GP Specialty Training

3. The main investigation into the reliability of the selection system uses a multi-facet Rasch model, which examines the extent to which differences between stations (the three simulations), questions, the clinical and non-clinical attributes being assessed, examiners, role players and testing centres affect the outcome of the assessment.
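For readers who want to see the general shape of a multi-facet Rasch model, one common textbook form is given below. This is an illustration only: the facets and notation are assumptions made for this sketch, not the exact specification used in the study.

```latex
% Illustrative many-facet Rasch model (rating-scale form); symbols are assumptions.
% P_{nijk}: probability that candidate n is awarded category k (rather than k-1)
%           on item i by examiner j.
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
% B_n: ability of candidate n
% D_i: difficulty of item (or station) i
% C_j: severity of examiner j
% F_k: difficulty of reaching category k from category k-1
```

Further facets, such as role player or testing centre, would enter as additional subtracted terms on the right-hand side.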

We will estimate the predictive validity of the different elements of the selection system, separately and in combination. We will also look at differential validity for different groups of trainees; for example, do trainees who graduated abroad do better or worse in the summative assessments than expected given their selection scores?
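One simple way to picture the "better or worse than expected" question is to fit a single regression line of outcome on selection score for all trainees, and then compare the average residual in each group. The sketch below assumes this common-line approach; the function names and the toy data in the usage note are hypothetical and are not the study's method or results.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_residual_by_group(scores, outcomes, groups):
    """Average (observed - predicted) outcome per group, using one common line.

    A positive value means the group did better than its selection scores
    predicted; a negative value means it did worse.
    """
    slope, intercept = fit_line(scores, outcomes)
    residuals = {}
    for s, o, g in zip(scores, outcomes, groups):
        residuals.setdefault(g, []).append(o - (intercept + slope * s))
    return {g: mean(r) for g, r in residuals.items()}
```

For example, with made-up data in which group "A" scores consistently above the common line and group "B" below it, `mean_residual_by_group([1, 2, 3, 4, 1, 2, 3, 4], [3, 5, 7, 9, 1, 3, 5, 7], ["A"] * 4 + ["B"] * 4)` gives a positive value for "A" and a negative value for "B".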

WORK PACKAGE 2: COST AND ECONOMIC EVALUATION

The aim of this work package is to estimate the cost-effectiveness of the selection process. Such work is crucial to justify the extensive investment in the selection process. We will look at the potential impact of modifications to the current process, for example altering the pass marks applied to selection at Stages 2 and 3, or using both Stage 2 scores as part of the final decision-making process.

We will consider the following costs, incurred over a five-year period for a single cohort of doctors entering GP training: selection; unfilled posts; extensions to training; and trainees who do not qualify as GPs. We propose a cost-effectiveness approach to the economic evaluation, using the number of trainees achieving CCT as the outcome measure. This analysis is based on the effectiveness of the selection process in determining training outcomes and will estimate the cost per trainee achieving CCT. We will undertake sensitivity analyses to consider the implications of potential variations in the different cost and effectiveness estimates. Such work will enable us to determine which inputs and/or outputs of the selection process are most influential in terms of cost-effectiveness.
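The arithmetic behind "cost per trainee achieving CCT" and a one-way sensitivity analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the class, field and function names are hypothetical, and no real cost figures are implied.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CohortInputs:
    """Hypothetical five-year totals for one cohort (names are assumptions)."""
    selection_cost: float        # cost of running the selection process
    unfilled_post_cost: float    # cost of unfilled training posts
    extension_cost: float        # cost of extensions to training
    non_qualifier_cost: float    # cost of trainees who do not qualify as GPs
    trainees_achieving_cct: int  # outcome measure


def cost_per_cct(inputs: CohortInputs) -> float:
    """Total cost divided by the number of trainees achieving CCT."""
    if inputs.trainees_achieving_cct <= 0:
        raise ValueError("at least one trainee must achieve CCT")
    total = (inputs.selection_cost + inputs.unfilled_post_cost
             + inputs.extension_cost + inputs.non_qualifier_cost)
    return total / inputs.trainees_achieving_cct


def one_way_sensitivity(base: CohortInputs, cost_field: str, factors):
    """Scale one cost input by each factor, holding all other inputs fixed."""
    results = {}
    for factor in factors:
        varied = replace(base, **{cost_field: getattr(base, cost_field) * factor})
        results[factor] = cost_per_cct(varied)
    return results
```

Calling `one_way_sensitivity` in turn for each cost field shows which input moves the cost per CCT the most, which is the kind of comparison the sensitivity analyses are intended to support.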

In an ideal world, we would also undertake a cost-benefit analysis (CBA), which would enable us to estimate the rate of return on investments in selection. However, this approach requires a number of assumptions about the value of health care provided by trainees and the variation in this value amongst the cohort of doctors applying for GP training. We will consider the quality of the data available for these variables, but we do not envisage that the results of a CBA would be sufficiently robust to inform practice.

WORK PACKAGE 3: SURVEY DATA

This work package addresses educational impact and acceptability by adding questions to the surveys that are already being used in the selection system. We propose undertaking a brief literature search for similar questionnaire data from other specialties, so that acceptability to candidates can be compared. The added questions will ask how candidates prepared for GP selection and their assessment of its wider educational value. We will also seek to understand why recruitment is less successful in some parts of the country. Similarly, it is important to know if candidates regard GP as their first choice and their reasons for this.

Now that applications for specialty training are all undertaken with Oriel, we intend to compare success at various stages of the selection process between GP and other specialties, for trainees who apply to more than one specialty. This may provide evidence regarding differences between specialties in terms of difficulty and popularity.

WORK PACKAGE 4: SYNTHESIS

Our synthesis will focus on illustrating how changes to the selection process could improve outcomes. The nature of these potential changes will depend on our initial findings, but here are some possibilities:

• If differential prediction analysis demonstrates differences in the relationship between selection and end of training assessment scores between parts of the country or groups of trainees (such as by gender), then modifications to the selection process may be recommended to improve fairness.

• Increasing the number of scenarios in Stage 3 of selection may be cost-effective in terms of reducing the number of trainees who have difficulties completing GP training. Alternatively, Stage 3 is a very expensive process and it may be that its incremental increase in predictive validity is too low to justify its inclusion, particularly if the predictive validity of Stage 2 is high.

• The cost-effectiveness of the selection system could be improved by including both Stage 2 scores in the final selection decision if this change increases predictive validity.

To our knowledge, we are using analytic techniques that are rarely or never used to this level of sophistication within medical education:

• Reliability: due to difficulties with data that are not fully crossed or nested, use of generalizability theory is usually restricted to very few factors. Our proposed use of a multilevel framework increases the number of factors that can be included in the analysis. Likewise, multi-facet Rasch modelling allows examiner, simulator and case differences to be modelled in a sophisticated manner.

• Validity: differential validity and differential prediction analyses will allow us to explore issues related to fairness in greater depth and provide more accurate estimates for the economic analyses than could be otherwise undertaken.

• Economic analyses: our approach to evaluating cost-effectiveness provides a high-quality method of estimating the impact of potential changes to selection.

Appendix 1.2
