Reducing test-retest variability - Visual acuity test limitations


1.3 Visual acuity test limitations

1.3.3 Reducing test-retest variability

TRV can be influenced by a number of factors. It has been shown to generally increase in the presence of optical defocus (Rosser et al., 2004, Carkeet et al., 2001, Elliott and Sheridan, 1988) and ocular pathology (Patel et al., 2008, Laidlaw et al., 2008). Other influences include the test chart design, test termination criteria, the scoring techniques used and inter-examiner variation (Gibson and Sanderson, 1980). Some of these factors can be minimised by following a standardised testing protocol (Klein et al., 1983) and adopting good testing procedures; for example, recommendations include measuring VA with the best corrected refractive result in place. In a group of normal subjects, Rosser et al., (2004) found TRV values ranging from +/-0.11 logMAR for 0 dioptre (D) defocus to +/-0.25 logMAR for 1D of defocus. Several studies have shown that TRV improves when the mean of multiple repeat measures of VA is taken (Rosser et al., 2003b, Shah et al., 2011b), the disadvantage being the increase in test time. Testing should be conducted under optimum and consistent lighting conditions since VA is known to vary with chart luminance. Sheedy et al., (1984) recommend that chart luminance should be in the range 80–320 candelas per square metre (cd/m²) since they found differences of only 0.02 logMAR when the chart luminance is doubled within the range 40–600 cd/m². British Standards advise that the background luminance of internally illuminated charts should be a minimum of 120 cd/m² (British Standards Institute, 2003).

Forced choice test procedures should be adopted to ensure that subjects are pushed to similar levels each time, reducing differences related to the individual’s response criterion or bias, and to ensure that termination criteria are fully satisfied. Arditi and Cagenello (1993) suggest that subjects should be required to read the entire chart in order to reap the benefit of the probability of a correct guess. With a two-alternative forced choice (2AFC) there is a 50% chance of a correct guess, compared to 10% with a 10AFC. Carkeet (2001) discussed how this guessing on subthreshold lines can actually introduce further variability to the measurement and described how, in the worst case scenario of a 2AFC with a termination criterion of five mistakes on a line, a 96.9% chance of proceeding to the subsequent line by guessing alone results in a significant spread of lines at which a subject may stop. This transition zone from seeing to non-seeing can be described by probit size (Carkeet et al., 2001), with probit analysis confirming a larger effect of termination criteria on VA and a larger probit size in those with small amounts of optical defocus compared to well corrected subjects. Carkeet (2001) combined exact calculation and Monte Carlo simulation to investigate the effect of the different nAFC available and test termination rules on the mean and SD of logMAR scores, and proposed a number of clinically suitable termination rules for different nAFC. Of importance here is the recommendation for Bailey-Lovie and ETDRS charts, which have a 10AFC (although the observer may not be aware of this), to use a termination rule of four-or-more letters wrong per line for optimum slope corrected SDs.
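The guessing probabilities quoted above can be checked with a short binomial calculation. The following is a minimal sketch, not Carkeet's (2001) method: it assumes five letters per line, independent pure guesses on every letter, and a rule that stops testing once a given number of mistakes is made on a line. The function name and its parameters are illustrative only.

```python
from math import comb


def prob_pass_line_by_guessing(n_alternatives, mistakes_to_stop, letters_per_line=5):
    """Probability that pure guessing does NOT trigger the termination rule on one line.

    Assumption: each letter is an independent guess with a 1/n chance of being
    correct, and testing stops when `mistakes_to_stop` or more letters on the
    line are wrong.
    """
    p_right = 1 / n_alternatives
    p_wrong = 1 - p_right
    # Sum the binomial probabilities of making fewer mistakes than the stopping count.
    return sum(
        comb(letters_per_line, m) * p_wrong**m * p_right**(letters_per_line - m)
        for m in range(mistakes_to_stop)
    )


# 2AFC, stop only when all five letters are wrong (the worst case described above).
print(round(prob_pass_line_by_guessing(2, 5), 3))   # 0.969
# 10AFC, stop at four-or-more letters wrong per line.
print(round(prob_pass_line_by_guessing(10, 4), 3))  # 0.081
```

Under these assumptions, the 2AFC case reproduces the 96.9% figure, whereas a 10AFC chart with a four-or-more-wrong rule leaves only about an 8% chance of passing a subthreshold line by guessing alone.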

The influence of scoring techniques on TRV has been examined. With Snellen charts, the line-assignment scoring technique is usually adopted, whereby the VA score is taken as the smallest line on which, by convention, either the majority (more than 50%) or 70% (NAS-NRC Committee on Vision, 1980) of the letters are correctly identified. With this technique, patients are given credit for lines, not letters, that are correctly identified. Line scoring with the Snellen chart has been shown to result in large values of TRV and the Snellen chart is recognised as being a poorly repeatable test (McGraw et al., 1995). With the advent of logMAR design VA charts with rows of equal numbers of letters and systematic changes in line size and spacing, TRV scores significantly improved using the line-assignment scoring technique.

A single-letter scoring protocol was later introduced by Ferris et al., (1982) in the ETDRS study, whereby credit is given for every letter read correctly in the final calculated VA score. Each letter is assigned a value of 0.02 logMAR, based on the calculation that each line of five letters is equal to 0.10 logMAR. See Section 2.2.5 for details on scoring. Single-letter scoring has resulted in further improved TRV values over the line-assignment technique for logMAR charts by effectively making the grading scale five times finer. The work of Bailey et al., (1991) demonstrates that a coarser grading scale results in an increase in the SD of the discrepancy distribution, in turn extending the 95% confidence limits for change, supporting the notion that a finer grading scale increases the ability of the clinician to detect change in the assessed parameter. Whilst single-letter scoring can be applied to the Snellen chart and results in lower TRV values, it is much more complex owing to the differing number of letters per line. Table 1.1 illustrates the improved TRV values for logMAR over Snellen charts and further improved TRV using single-letter rather than line-assignment scoring as published by different research groups.
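The difference between the two scoring techniques can be illustrated with a small sketch. It is not the scoring procedure of Section 2.2.5: it assumes a logMAR chart with five letters per line and 0.10 logMAR steps, a more-than-50% pass criterion for line assignment, and a letter score obtained by subtracting 0.02 logMAR per correct letter from a value one step above the largest line tested. The function names and the response data are hypothetical.

```python
def line_assignment_score(correct_per_line, line_sizes, pass_fraction=0.5, letters_per_line=5):
    """Smallest line (lowest logMAR) on which more than `pass_fraction` of letters were correct."""
    passed = [
        size
        for size, n_correct in zip(line_sizes, correct_per_line)
        if n_correct / letters_per_line > pass_fraction
    ]
    return min(passed) if passed else None


def single_letter_score(correct_per_line, line_sizes, letters_per_line=5, line_step=0.10):
    """Letter-by-letter score: every correct letter is worth line_step / letters_per_line logMAR.

    Assumption: line_sizes[0] is the largest line tested and all larger lines
    would have been read in full.
    """
    per_letter = line_step / letters_per_line  # 0.02 logMAR for a five-letter line
    total_correct = sum(correct_per_line)
    return round(line_sizes[0] + line_step - per_letter * total_correct, 2)


# Hypothetical test and retest responses on lines 0.3, 0.2, 0.1, 0.0 and -0.1 logMAR.
lines = [0.3, 0.2, 0.1, 0.0, -0.1]
test_visit = [5, 5, 3, 0, 0]
retest_visit = [5, 5, 5, 2, 0]

print(line_assignment_score(test_visit, lines), line_assignment_score(retest_visit, lines))  # 0.1 0.1
print(single_letter_score(test_visit, lines), single_letter_score(retest_visit, lines))      # 0.14 0.06
```

In this illustration, both visits receive the same line-assignment score even though four more letters were read at retest, while the letter score registers the 0.08 logMAR difference. This is the sense in which the finer grading scale allows smaller changes to be detected.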

Increasing the number of letters per line whilst employing single-letter scoring can further improve TRV (Laidlaw et al., 2003, Rosser et al., 2001, Bokinni et al., 2015), with a recent study by Shamir et al., (2016) demonstrating improved reproducibility in VA scores by increasing the number of letters per line up to seven. Raasch et al., (1998) reported that an increase in the number of letters per line by a factor of ‘n’ can improve the precision of VA measurements by a factor of √n. This does of course increase test times, and a viable balance has to be reached.
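The √n relationship reported by Raasch et al., (1998) can be turned into a quick estimate of the expected benefit of adding letters to each line. The sketch below assumes that TRV scales directly with the SD of the measurement, so that it shrinks by the same √n factor; the starting TRV of 0.14 logMAR is an illustrative value, and the function names are hypothetical.

```python
from math import sqrt


def precision_gain(letters_before, letters_after):
    """Factor by which precision improves when letters per line change by n = after / before."""
    return sqrt(letters_after / letters_before)


def expected_trv(trv_before, letters_before, letters_after):
    """Expected TRV after changing letters per line, assuming TRV is proportional to the SD."""
    return trv_before / precision_gain(letters_before, letters_after)


# Moving from five to seven letters per line.
print(round(precision_gain(5, 7), 2))      # 1.18
print(round(expected_trv(0.14, 5, 7), 3))  # 0.118
```

On these assumptions, a 40% increase in the number of letters per line yields only about an 18% gain in precision, which illustrates why the gain has to be weighed against the longer test time.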

Table 1.1: Previously reported TRV values for Snellen and logMAR charts, using either line-assignment or single-letter scoring techniques in different subject groups (all TRV values in logMAR).

Research Group                 Subject Group                             Snellen Chart (Line, Letter); LogMAR Chart (Line, Letter)
Rosser et al., (2001)          Cataract, pseudophakia, early glaucoma    +/-0.33  +/-0.24  +/-0.18
Laidlaw et al., (2003)         Amblyopic children                        +/-0.30  +/-0.29  +/-0.20  +/-0.14
Lim et al., (2010)             Mixed pathology                           +/-0.18  +/-0.14
Elliott and Sheridan (1988)    Normals                                   +/-0.12  +/-0.07
Bailey et al., (1991)          Normals                                   +/-0.20  +/-0.10
Vanden Bosch and Wall (1997)   Normals                                   +/-0.10  +/-0.07
Arditi and Cagenello

Thus, in summary, while a lower TRV can be achieved by optimising test conditions and adopting recommended test procedures, it can be lowered further by increasing the number of measurable increments, both by:

a) Reducing the step size between lines by using a logMAR rather than a Snellen chart. The Snellen chart typically has 9 steps in the range 6/60 to 6/4, whereas the logMAR chart has 13 steps for the same VA range.

b) Employing single-letter rather than line-assignment scoring techniques. This results in a grading scale that is five times finer.

Furthermore, the number of alternative letter choices and the test termination criteria can also affect TRV and acuity threshold values and should be considered. It is important to recognise that whilst scores for TRV have improved with the use of logMAR and in particular ETDRS charts, Table 1.1 demonstrates that TRV values of up to 2 logMAR lines can still be observed, which has important implications for the monitoring of disease progression and treatment efficacy.

Any useful clinical test should provide a favourable signal-to-noise ratio, such that the disease signal provided by the test is not lost in the background test noise. Whilst adapting VA test design features and adopting standardised testing procedures have reduced this test noise, there appears to be a limit on how much it can be reduced. Indeed, Stewart et al., (2006) found no further improvement in TRV in children when using a randomly interleaved double-staircase technique, with the Sloan letter set in 0.02 logMAR size increments and the threshold acuity crossed ten times, compared with VA measurements obtained using a standard ETDRS chart (+/-0.13 versus +/-0.11 logMAR, respectively). The next section looks at the choice of letter styles and sets employed in VA tests and the effects these can have on VA thresholds and variability measures.
