Examination of the models against moving averages of the raw data showed a lack of fit for the older children, with the curves appearing not to 'flatten out' fast enough. A Box-Cox transformation of age was added to the symmetric logistic but this had little impact on the likelihood (maximised likelihood for the 6 parameter age transformed model = -599.50). A power parameter, $\beta_4$, was added to allow the curves to be asymmetric
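$$\operatorname{logit}(P_j) = -100 + \left(\beta_{1j} + 100\right)\left(1 + e^{-\beta_2 (t - \beta_{3j})}\right)^{-\beta_4}$$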
The fit improved significantly and the likelihood was increased to -595.10.
The seven parameter standard and inverse polynomials, the symmetric and the
asymmetric logistic models are shown plotted on a logit scale, together with the
(smoothed) raw data, in figure 4.3.
Figure 4.3 : Comparison of the models fitted (125 point moving average shown)
Model 1: Standard polynomial; 2: Inverse polynomial; 3: Symmetric logistic; 4: Asymmetric logistic. Horizontal axis: Age (years).
Although the asymmetric logistic clearly fitted the data better, the model converged to a fit with an infinitely large value of the 'slope' parameter, $\beta_2$, being counteracted by an infinitely small value of the power parameter, $\beta_4$. What this indicates is that in the best fitting model, the rising curve has to 'flatten out' infinitely fast, giving what is effectively two straight lines on the logit scale, reaching asymptotes at different ages. $\beta_2$ was therefore fixed at some suitably large number, to give the effect of a discontinuity where the asymptote is reached. The fitted equation had parameter estimates $\beta_2 = 25$ (fixed), $\beta_{1,3/3} = 2.898$, $\beta_{3,3/3} = 6.202$, $\beta_{1,3/4.5} = 4.324$, $\beta_{3,3/4.5} = 5.033$, $\beta_4 = 5.097 \times 10^{-4}$. The inflexion points were significantly different ($H_0$: $\beta_{3,3/3} = \beta_{3,3/4.5}$, $\chi^2_{(1)} = 4.88$).
A binary parameter representing dataset was added to the model. This did not significantly improve the fit ($\chi^2_{(1)} = 0.51$).
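As a rough aid to reading the two likelihood ratio statistics quoted above (4.88 and 0.51), the sketch below converts a chi-squared statistic into a p-value. It assumes each test is on one degree of freedom, since each adds or constrains a single parameter; the function name `chi2_1df_pvalue` is illustrative and is not taken from the thesis. It relies only on the fact that a chi-squared variable on 1 df is a squared standard normal.

```python
import math


def chi2_1df_pvalue(statistic):
    """Upper-tail probability of a chi-squared variable on 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    # P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for a standard normal Z
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistics quoted in the text (1 df is an assumption, see above)
p_inflexion = chi2_1df_pvalue(4.88)  # equality of the two inflexion points
p_dataset = chi2_1df_pvalue(0.51)    # binary dataset indicator

print(f"inflexion points: p = {p_inflexion:.3f}")
print(f"dataset effect:   p = {p_dataset:.3f}")
```

On this reading the first statistic falls below the conventional 5% level and the second does not, which agrees with the conclusions drawn in the text.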
4.3.1 Goodness-of-fit
The final model (asymmetric logistic) is compared to the raw smoothed data on a probability scale in figure 4.4.
Figure 4.4 : The asymmetric logistic model on a probability scale (125 point moving average shown). Horizontal axis: Age (years).
Given that the population values are expected to be non-decreasing with age, then the fit appears to be adequate.
Note that on this scale the asymmetric logistic transforms to a steadily rising curve which flattens at the age where all who will achieve the category have done so. Therefore the model is intuitively appropriate from a clinical perspective.
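The shape described above can be illustrated numerically. The sketch below is a reconstruction, not the program used for the fit: it assumes the asymmetric logistic has the form logit(P) = -100 + (β1j + 100)(1 + exp(-β2(t - β3j)))^(-β4), uses the parameter estimates quoted earlier in this section, and assumes an exponent of -4 for β4 (chosen because it reproduces the ages in table 4.3). The names `model_logit`, `pass_probability` and `ESTIMATES` are illustrative.

```python
import math

B2 = 25.0      # 'slope' parameter, fixed in the text
B4 = 5.097e-4  # power parameter; the exponent -4 is an ASSUMPTION
# (beta1, beta3) per acuity level, as quoted in the text
ESTIMATES = {"3/3": (2.898, 6.202), "3/4.5": (4.324, 5.033)}


def model_logit(t, b1, b3, b2=B2, b4=B4):
    """Assumed asymmetric logistic on the logit scale (reconstructed form)."""
    z = -b2 * (t - b3)
    # log(1 + exp(z)) computed without overflow for large |z|
    if z > 0:
        softplus = z + math.log1p(math.exp(-z))
    else:
        softplus = math.log1p(math.exp(z))
    return -100.0 + (b1 + 100.0) * math.exp(-b4 * softplus)


def pass_probability(t, level):
    """Probability of achieving the given acuity level or better at age t."""
    b1, b3 = ESTIMATES[level]
    return 1.0 / (1.0 + math.exp(-model_logit(t, b1, b3)))


for age in (3, 4, 5, 6, 7, 8):
    row = ", ".join(f"{lvl}: {pass_probability(age, lvl):.3f}" for lvl in ESTIMATES)
    print(f"age {age}: {row}")
```

Under these assumptions each curve rises steadily and is effectively flat beyond β3j, where it equals the final proportion 1/(1 + exp(-β1j)).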
4.3.2 Achieving specified failure rates at chosen ages
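There were two category dependent parameters in the final model, $\beta_{1j}$ and $\beta_{3j}$. These were modelled as smooth functions of letter size, $\beta_{1j} = g_1(s_j)$ and $\beta_{3j} = g_2(s_j)$, giving

$$\operatorname{logit}(P_j) = -100 + \left(g_1(s_j) + 100\right)\left(1 + e^{-\beta_2 (t - g_2(s_j))}\right)^{-\beta_4} \qquad (4.10)$$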
In the absence of any extra information, $g_1$ and $g_2$ were considered to be linear. Hence, an estimate of the letter size, $s$, that would result in a chosen proportion, $P$, passing at a given age, $t$, is given by the solution to the equations:

$$\operatorname{logit}(P) = -100 + \left(\beta_{1s} + 100\right)\left(1 + e^{-\beta_2 (t - \beta_{3s})}\right)^{-\beta_4} \qquad (4.11)$$

$$\beta_{1s} = \beta_{1,3/3} + \frac{s - 3}{4.5 - 3}\left(\beta_{1,3/4.5} - \beta_{1,3/3}\right) \qquad (4.12)$$

$$\beta_{3s} = \beta_{3,3/3} + \frac{s - 3}{4.5 - 3}\left(\beta_{3,3/4.5} - \beta_{3,3/3}\right) \qquad (4.13)$$

Figure 4.5 shows how the letter size required to give pass rates of 85, 90 and 95% relates to the age of the child. The estimate of letter size required to identify the worst 10% at 4 years and 9 months is 3.59.
Figure 4.6 shows the distances resulting in 90 and 95% pass-rates using letter-sizes of 3 and 4.5. The graph shows that standing at a distance of 2.50 metres from letter size 3 or, alternatively, 3.75 metres from letter size 4.5, gives the same referral at age 4 years 9 months as letter size 3.59 from a distance of 3 metres.
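The calculation described above can be sketched as a one-dimensional root search. Everything here is a hedged reconstruction: the model form logit(P) = -100 + (β1 + 100)(1 + exp(-β2(t - β3)))^(-β4), the exponent -4 for β4, and linear interpolation of β1 and β3 between letter sizes 3 and 4.5 are assumptions, as is the rule that an equivalent testing distance scales in proportion to letter size (constant visual angle). The function names are illustrative.

```python
import math

B2, B4 = 25.0, 5.097e-4          # slope (fixed) and power (exponent -4 ASSUMED)
BETA1 = {3.0: 2.898, 4.5: 4.324}  # asymptotes on the logit scale, from the text
BETA3 = {3.0: 6.202, 4.5: 5.033}  # ages at which the asymptotes are reached


def interpolate(s, values):
    """Linear interpolation in letter size between sizes 3 and 4.5 (assumed)."""
    return values[3.0] + (s - 3.0) / (4.5 - 3.0) * (values[4.5] - values[3.0])


def model_logit(t, s):
    """Assumed asymmetric logistic for letter size s at age t."""
    b1, b3 = interpolate(s, BETA1), interpolate(s, BETA3)
    z = -B2 * (t - b3)
    softplus = z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))
    return -100.0 + (b1 + 100.0) * math.exp(-B4 * softplus)


def letter_size_for(pass_rate, age, lo=2.0, hi=5.0):
    """Bisection for the letter size giving the chosen pass rate at the given age."""
    target = math.log(pass_rate / (1.0 - pass_rate))
    if not model_logit(age, lo) <= target <= model_logit(age, hi):
        raise ValueError("pass rate is not attainable within the search range")
    for _ in range(60):  # the logit is increasing in s, so bisection applies
        mid = 0.5 * (lo + hi)
        if model_logit(age, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equivalent_distance(letter_size, reference_size, reference_distance=3.0):
    """Distance at which letter_size subtends the same angle (assumed rule)."""
    return reference_distance * letter_size / reference_size


size = letter_size_for(0.90, 4.75)
print(f"letter size for a 90% pass rate at 4 years 9 months: {size:.2f}")
print(f"distance from size 3:   {equivalent_distance(3.0, 3.59):.2f} m")
print(f"distance from size 4.5: {equivalent_distance(4.5, 3.59):.2f} m")
```

With the rounded estimates quoted in the text, the search gives a letter size close to the reported 3.59 and inside its reported interval; exact agreement is not expected because the published parameters are rounded and the model form is reconstructed.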
Figure 4.5 : Letter sizes giving 85, 90 and 95% pass-rates, assuming the model parameters are linear functions of letter size. Horizontal axis: Age (years).
Figure 4.6 : Distances (metres) from letters of sizes 3/3 and 3/4.5 giving 90 and 95% pass-rates. Horizontal axis: Age (years).
4.3.3 Confidence intervals
The ages at which the asymptotes are reached are given by the $\beta_{3j}$. The proportion of children finally achieving each acuity level $j$ or better is given by $\dfrac{1}{1 + e^{-\beta_{1j}}}$. The age at which a given proportion $P$ reach each acuity level $j$ or better can be calculated from $\beta_{3j} - \eta_j / \beta_2$ where

$$\eta_j = \ln\left[\left(\frac{\beta_{1j} + 100}{\ln\left(\dfrac{P}{1 - P}\right) + 100}\right)^{1/\beta_4} - 1\right]$$
The estimated letter size which a specified proportion will be able to pass at a given age can be calculated from equations 4.5. The distance of measurement from a letter of a specified size that will lead to a chosen pass rate at a chosen age can be evaluated using equation 4.4.
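The age at which a chosen proportion reach an acuity level can be checked with a short closed-form calculation. This is a sketch under stated assumptions: it inverts the reconstructed form logit(P) = -100 + (β1j + 100)(1 + exp(-β2(t - β3j)))^(-β4), and takes the exponent of β4 to be -4. The function name `age_reaching` is illustrative.

```python
import math


def age_reaching(p, b1, b3, b2=25.0, b4=5.097e-4):
    """Age at which proportion p achieve the level, under the assumed model.

    b2 = 25 is the fixed slope from the text; b4 uses an ASSUMED exponent of -4.
    """
    target = math.log(p / (1.0 - p))
    if target >= b1:
        raise ValueError("p is at or above the final proportion for this level")
    ratio = (b1 + 100.0) / (target + 100.0)
    # eta = ln(ratio**(1/b4) - 1), evaluated in log space to avoid overflow
    log_power = math.log(ratio) / b4
    eta = log_power + math.log1p(-math.exp(-log_power))
    return b3 - eta / b2


# Estimates quoted in the text: (beta1, beta3) for each acuity level
print(f"3/4.5: 90% by age {age_reaching(0.90, 4.324, 5.033):.2f}")
print(f"3/3:   90% by age {age_reaching(0.90, 2.898, 6.202):.2f}")
```

Under these assumptions the two ages agree, to the precision reported, with the 'Age by which 90% achieve the acuity' row of table 4.3.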
Estimates and confidence intervals are given in table 4.3. According to the estimates, approximately 95% eventually achieve an acuity level of 3/3, the normal for an adult. This percentage could have been built into the model as a constraint to ensure that the standards obtained were compatible with known adult values. This process would be as described in sections 2.3.3 and 2.4.3 for continuous outcomes.
All of the confidence intervals seem reasonably narrow, with the ages being estimated to within ±3 to 6 months.
Table 4.3 : Functions of parameters of practical interest : estimates and 95% confidence intervals, based on asymmetric logistic model.

| | Acuity level 3/4.5: estimate (95% CI) | Acuity level 3/3: estimate (95% CI) |
|---|---|---|
| Age at which the maximum percentage achieve each acuity level | 5.03 (4.60, 5.58) | 6.20 (5.85, 6.60) |
| Proportion finally achieving that acuity (or better) | 98.69% (97.84, 99.32) | 94.78% (92.44, 96.55) |
| Age by which 90% achieve the acuity | 3.42 (3.13, 3.67) | 5.67 (5.46, 5.87) |
| Distance (metres) leading to 90% pass rate at 4 yrs 9 months* | 3.75 (3.63, 3.85) | 2.50 (2.42, 2.57) |

Letter size which gives a 90% pass rate at age 4 yrs 9 months from a distance of 3 metres*: 3.59 (3.50, 3.72)

* Assuming a linear relationship between the model parameters and acuity level
4.3.3.1 Sample size estimation
The estimated letter sizes yielding a 90% pass rate at ages 4 years, 4 years 9 months and 6½ years were calculated using equations 3.6 - 3.8. Profile likelihood confidence intervals were constructed for these letter sizes and the limits expressed in terms of the percentage referred. The data set was duplicated (giving a total sample of 2280 data-points with each point replicated once) and the process repeated. By similar duplications of the dataset, the original sample was artificially increased to 4, 6 and 8 times the 1140 values, and the intervals re-calculated. The results of these analyses are given in table 4.4.
Table 4.4 : Profile likelihood 95% confidence intervals for the pass rates using increasing sample sizes

| Sample size | Age of testing: 4 years | Age of testing: 4 years 9 months | Age of testing: 6 years 6 months |
|---|---|---|---|
| 1140 | (85.54, 92.27) | (87.90, 91.78) | (85.42, 93.33) |
| 2280 | (88.28, 91.61) | (88.57, 91.32) | (86.92, 92.50) |
| 4560 | (88.80, 91.16) | (89.00, 90.94) | (87.91, 91.48) |
| 6840 | (89.02, 90.95) | (89.19, 90.77) | (88.32, 91.32) |
| 9120 | (89.15, 90.82) | (89.34, 90.71) | (88.45, 91.21) |
Figures 4.7 and 4.8 illustrate how the limits converge as the sample size is increased. There is considerably more uncertainty near to the inflexion point. The changes in the interval widths with increasing sample size were mostly in agreement with those theoretically predicted at a single age, where the width of an interval is inversely proportional to the square root of the sample size. That is, increasing the sample size 2, 4, 6 and 8-fold would be expected to decrease the width to 0.7, 0.5, 0.4 and 0.35 of its original width.
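The expected shrinkage can be compared with table 4.4 directly. The sketch below uses the limits reported for testing at 4 years 9 months and assumes that the interval width scales as one over the square root of the sample size; the variable and function names are illustrative.

```python
import math

BASE_N = 1140
# 95% limits for the pass rate at 4 years 9 months, copied from table 4.4
LIMITS_4Y9M = {
    1140: (87.90, 91.78),
    2280: (88.57, 91.32),
    4560: (89.00, 90.94),
    6840: (89.19, 90.77),
    9120: (89.34, 90.71),
}


def expected_width_ratio(fold):
    """Width relative to the original when the sample is `fold` times larger."""
    return 1.0 / math.sqrt(fold)


def observed_width_ratio(n):
    """Observed interval width at sample size n relative to the width at BASE_N."""
    base_lo, base_hi = LIMITS_4Y9M[BASE_N]
    lo, hi = LIMITS_4Y9M[n]
    return (hi - lo) / (base_hi - base_lo)


for n in sorted(LIMITS_4Y9M):
    fold = n / BASE_N
    print(f"n = {n:5d} ({fold:.0f}-fold): "
          f"expected {expected_width_ratio(fold):.2f}, "
          f"observed {observed_width_ratio(n):.2f}")
```

At this age the observed ratios track the square-root rule closely; the same comparison can be repeated for the other two columns of table 4.4 by substituting their limits.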
Figure 4.7 : Letter sizes giving 85, 88, 90, 92 and 95% referral across the age range. Horizontal axis: Age (years).
Figure 4.8 : Confidence limits obtained at ages 4, 4.75 and 6.5 years using data samples of sizes 1140, 2280, 4560, 6840 and 9120 (limits joined across ages using splines). Horizontal axis: Age (years).
4.4 DISCUSSION
In this chapter smoothly changing age-related standards, and confidence intervals for parameters of practical interest, have been produced using ordinal logistic regression and maximum likelihood techniques. One particular shortfall of ordinal outcomes was highlighted by the acuity data. It is at present recommended that children have their acuity tested during their first school year, aged between 4 and 6 years (Hall, 1989). Before 6 years of age, a substantial minority of children do not have fully matured vision and may present as having a problem using the current adult standard cut-off level of 6/9, or equivalently, 3/4.5. The result is that excessive numbers of school starters may be referred for further, time-consuming and costly, tests. An alternative would be to use the cut-off of 3/6 at an earlier age. By the age of 3½ years, 90% of children will have acuities of 3/4.5 or better, so using this cut-off will result in the worst 10% being identified. Although this would be acceptable, ensuring the attendance of all 3½ year olds at a sight test would have many practical and administrative difficulties, and between 3½ and 6 years there is no suitable test.
There are two ways in which the age-related nature of the acuity data may be dealt with to maintain the required levels of referral throughout a range of ages spanning 3 to 6 years. The first is to change the letter sizes, possibly creating one new letter size to be used for testing children when they start school, from a distance of 3 metres. The second way is to use a standard letter size, but to change the distance of measurement according to the child's age. This alternative is preferable because it would be easier to implement on a practical level. Children do not all have to be tested at exactly the same age and the size of the letters used for testing purposes does not have to be changed according to the child's age. The information regarding the distance required could easily be transferred to a tape marked out according to age. This tape could then be sent to schools for use with the SSAS.
There was insufficient data available to ascertain the relationship between the categories and age. Children with acuities in the worst categories, 3/9, 3/12 and 3/18 or worse, are rare. The sample size would have to be increased considerably to ensure enough children in these categories to allow formal testing of the relationship. As an illustration of the potential for creating standards which involve intermediate categories to those tested, the model parameters were assumed to be linearly related to the letter sizes. For other applications, there may be sufficient data across a range of categories to allow the accurate determination of standards related to category.
Like many other paediatric measurements, acuity appears to change continuously before asymptoting at the adult level. Furthermore, any ordinal milestone outcome would be expected to show a non-decreasing relationship with age. These features were sharply evident in the acuity dataset, ruling out polynomial models in age, and forcing the use of non-standard software to estimate non-linear ordinal data models. The flexibility afforded by constrained likelihood maximisation, in constructing confidence intervals both for the parameters and functions of parameters of particular interest, also required specialised software. The techniques shown could easily be extended to situations with sex or other group differences in the parameters of interest. They may also have useful applications creating age-related standards for other ordinal outcome data.
It is of interest that the changes in confidence interval widths when the sample size was increased were in accordance with those expected for an increase at a single age. The implication is that the calculation of a sample size to give a specified precision, when some estimate of the centile variability is available, is a simple matter. Cole, 1990, uses this approach to sample size estimation. However, since he did not have a single likelihood to be maximised, it was necessary to introduce a somewhat arbitrary 'scaling factor'. The exact figure used depended "on how the smoothing was done, and whether the particular age group is near to the centre or the extremes of the age range studied. For this reason a precise value cannot be given, but using a scaling factor of 2 or 3 ought to be safe, if conservative". Although sample size estimation will never be an exact science, the removal of some of the arbitrariness via the maximum and profile likelihood approach to centile estimation must ultimately be advantageous.
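The 'simple matter' referred to above can be sketched as follows. The calculation assumes that interval width is inversely proportional to the square root of the sample size, as the duplication exercise in section 4.3.3.1 suggests, and it does not include any scaling factor of the kind Cole describes. The function name `required_sample_size` and the target width of 2 percentage points are illustrative choices.

```python
import math


def required_sample_size(current_n, current_width, target_width):
    """Sample size needed to shrink an interval of current_width to target_width.

    Assumes width is proportional to 1 / sqrt(n), so n scales with the squared
    ratio of the widths. The result is rounded up to a whole observation.
    """
    if current_n <= 0 or current_width <= 0 or target_width <= 0:
        raise ValueError("sample size and widths must be positive")
    return math.ceil(current_n * (current_width / target_width) ** 2)


# Width of the 95% interval for the pass rate at 4 years 9 months with
# n = 1140, taken from table 4.4: 91.78 - 87.90 = 3.88 percentage points.
current_width = 3.88
n_needed = required_sample_size(1140, current_width, target_width=2.0)
print(f"sample size for a 2-point interval: {n_needed}")
```

Because the rule is quadratic in the width ratio, halving the interval width requires roughly four times as many observations, which is the pattern seen between the 1140 and 4560 rows of table 4.4.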
CHAPTER 5
CONCLUSION
The general structure and development of this thesis was outlined in the preface. In chapter 1 both the past and present literature is reviewed. The problems associated with the construction of adequate age-related normal ranges have been recognised for almost half a century. Initially the limiting factor was a lack of availability of software for complex model fitting. More flexible methodology has been developed for the construction of age-related standards as a direct result of increased software capabilities. In more recent years custom built packages specifically for the purpose of standard construction have been made publicly available (for example: GROSTAT (Rasbash et al., 1991)). Similarly it is now possible for the clinician to accurately calculate and chart z-scores for an individual using automated procedures (for example z-scores can be automatically calculated for some growth parameters within an Excel spreadsheet (Child Growth Foundation, 1997)). Such automation has in turn led to increased interest in the creation of more precise centile charts.
Several methodologies have been proposed for the construction of age-related standards. The different approaches are presented and discussed in chapter 1. More recently interest has focused on comparison between different methods for specific datasets (for example: Wright & Royston, 1997, Bonellie & Raab, 1996). The approach taken in this thesis was to make comparison on the basis of the intrinsic properties and expected performance on a range of pre-determined criteria rather than the goodness-of-fit achieved on selected datasets. The review aimed to identify where and why each method may be expected to give more accurate characterisation of the population centiles. Whilst comparison using specific datasets may provide some insight into potential differences it cannot provide the definitive answer as to which method gives a better fit generally, no matter how many specific datasets are considered.
The discussion of which method to use when may depend on the mode of data collection. There are two possibilities. One is that the dataset has already been collected for another purpose and standards need to be constructed from it. The other is that data is to be collected primarily for centile construction. In the latter case the methods of data collection can be specifically geared towards rectifying the potential shortcomings of the proposed construction method. For example, when Rusconi et al., 1994, used the non-parametric approach to construct standards for respiratory rates in the first 3 years of life they oversampled near to birth and included infants older than 3 years of age.
In chapters 2-4 several methods have been employed and combined to give a flexible format for creating centiles. This combined method is illustrated for both continuous and ordinal outcomes and has been shown to have many desirable properties. A natural extension to the use of likelihoods for estimation of centiles is the use of profile likelihood to create confidence intervals around the fitted centiles. The centiles can readily be constrained in various ways to incorporate known values (e.g. adult, birth). Incorporation of the correlation structure between repeat measurements means that a potentially biased available dataset may be used to create unbiased centiles without the need to reduce the dataset.
There was a discussion in section 1.3.1 of the curve forms that could be used to model either the centiles or the changes in the median, spread and skew. In this thesis, exponential models have mostly been used as the basis for the models. These are compared to standard polynomials (chapters 3 and 4) and to a subset of fractional polynomials (chapter 4). The exponential models were found to be appropriate for the datasets presented here. Paediatric values are usually expected to asymptote at adult levels and the exponential models satisfy this requirement. However exponentials are non-linear and require fairly specialist software to fit the models. By using linear models the methodology would be more transportable. Royston & Altman, 1994, discuss the flexibility of fractional polynomials and present an algorithm for finding the best model. Approximate significance tests can be performed between the models and the model set restricted to include only those models which asymptote. Employing fractional polynomials rather than exponentials may simplify the computation and provide a wide choice of appropriate model shapes within a unified framework. This would greatly assist construction of a general program for use with varying datasets. However, the complete automation of model choice within a general framework may lead to unnecessarily complex curve forms being used where there is a more reasonable and simpler model. For example, in chapter 4 the asymmetric logistic model was only considered after examination of the data on the logistic scale. It is not a widely used curve and it is unlikely to have been included as an option in any automated model selection procedure. Examination of the fitted model shows the asymmetric logistic to be ideal for data of this type. The steady rise and sudden flattening to an asymptote when all who are going to achieve each level of the ordinal outcome have done so (see figure 4.4, page 103) makes an appealing model. It is doubtful whether any single fractional polynomial could achieve this curve shape.
The computation involved in the methods used in chapters 2-4 is no more complex than for other methods which have been developed into custom built packages for general application. Hence a software package could be developed using the maximum likelihood approach. This could incorporate options for ordinal or continuous outcomes, allow for a variety of models and comparison between models via maximum likelihood tests, create confidence intervals around the fitted centiles, etc. However, before embarking on this enterprise it may be prudent to ask just how useful such a package would be regardless of how user-friendly or adaptable it ultimately is. There must be a limit to the number of parameters for which age-related standards are truly clinically useful. Historically the introduction of easier computation of complex procedures has led to instances of inappropriate application of techniques.
There are at least some recent examples of published standards that are based on a relatively small number of measurements and these are clearly highly imprecise (for example Laudy et al., 1995, van Splunder et al., 1995, Richards & Farah, 1994). In