

desirable.

One approach to the problem of interobserver variability is to use the same observer for all bone age assessments. In a survey of United Kingdom bone age assessments, however, approximately two thirds of the assessments were performed by three or more individuals in an institution [Buck83]. Apart from the scientific evidence for intraobserver and interobserver variability, there is some strong feeling in the literature about misuse of methods of bone age assessment. This is best summarised by Graham’s statement: “valid skeletal age assessment presupposes a working knowledge of the fundamental concepts and tools involved, and an amateur interpretation is often worse than none at all” [Grah72].

1.3 Justification of automated assessment of bone age

1.3.1 The reduction of observer variability through automation

For bone age assessment to be of use in both clinical and research environments, it is important that the bone age result be accurate, the results be reproducible between assessors (observers), and that the results be consistent with serial measurements of bone age. Assuming an automated bone age system can reasonably duplicate the tasks of a human assessor, then subjectivity of the assessment would be eliminated as a possible cause of accuracy loss. The implementation of a hybrid system combining a range of skeletal maturity indicators from the different methods should address accuracy problems due to inherent limitations in the bone age assessment method. Finally, having an automated system highlight problems with the dissociation of bone maturity within a single radiograph would provide an opportunity for an expert assessor to intervene in the assessment.

The findings of large intra- and interobserver variability, and the unresolved question of what variability is considered acceptable, are motivation for the development of an automated system for bone age assessment. If such a system does not involve random processes, such as Monte Carlo methods, and the algorithms are stable and tolerant of small changes in radiographic features, then there is no reason to expect an automated system to produce variability between bone age assessments. This should be the case for both reassessment of the same radiograph and assessment of pairs of repeat radiographs at the same age. The objective of the automated system would be to eliminate observer variability, leaving human maturational variance as the main contributor to uncertainty in bone age assessment.

1.3.2 The feasibility of using novel measures of bone age

The significant differences in skeletal maturity indicators for the TW and Fels methods, and the differences in how the skeletal maturity information is combined, suggest the possibility of other, novel measures of bone age assessment that have not yet been developed. For example, Roche et al. found through their extensive analysis of skeletal maturity indicators that there were 13 indicators in boys that were not useful when applied to girls [Roch88, p57]. They decided to drop these 13 indicators from the Fels method, and this resulted in 98 indicators that could be used for both girls and boys. However, it is important to note that their reason for looking at the indicators separately for boys and girls was that they thought some of the indicators might have been influenced by differences in muscular development specific to boys [Roch88, p48]. Although they did find differences between girls and boys, they chose not to include them, probably in the interests of simplifying the Fels method. This is an example of where automated analysis would allow extra indicators to be incorporated into the system, and could even allow indicators to be added that are difficult for a human assessor to determine (for example, quantitative measures such as ratios of areas). An automated system could easily cope with the added complexity of extra indicators as long as computational overheads did not become prohibitive. Ideally, an automated system should have the flexibility to allow new measures of bone age to be added. Existing manual methods are much more difficult to modify, mostly because of the need to retrain the person performing the assessment. Furthermore, there is the possibility of measures that a computer is good at quantifying but a human is not, such as area and contour angularity.

As mentioned in Section 1.3.1, one concern about bone age assessment is a lack of consistency in maturity between individual bones [Tara76]. An automated system could provide data management facilities that would make it possible to highlight bones that show dissociation from other bones in the hand and wrist. Furthermore, a system could track both individual bones and groups of bones across serial investigations. In this respect, differences between bones could be useful in their own right, rather than being lost when a single bone age is assigned to the whole radiograph.
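The idea of highlighting dissociated bones can be illustrated with a minimal sketch. It assumes, hypothetically, that the system produces a separate age estimate for each bone, and it flags any bone that differs from the median of the radiograph by more than a chosen threshold. The function name, the bone names, the example values, and the one-year threshold are all illustrative assumptions and are not taken from any published method.

```python
from statistics import median


def flag_dissociated_bones(bone_ages, threshold_years=1.0):
    """Return bones whose estimated age differs from the radiograph median.

    bone_ages maps a bone name to a per-bone age estimate in years
    (a hypothetical output of an automated system). threshold_years is
    an illustrative cut-off, not a clinically validated value.
    """
    centre = median(bone_ages.values())
    return {
        bone: round(age - centre, 2)
        for bone, age in bone_ages.items()
        if abs(age - centre) > threshold_years
    }


# Hypothetical per-bone estimates (years) for a single radiograph.
example = {
    "radius": 10.2,
    "ulna": 10.0,
    "metacarpal_1": 10.4,
    "capitate": 8.1,
    "hamate": 10.1,
}
flagged = flag_dissociated_bones(example)
print(flagged)  # only the capitate lies more than one year from the median
```

The median is used as the reference because a single outlying bone does not shift it, whereas a mean would be pulled towards the dissociated bone. The flagged bones would be presented to an expert assessor rather than being resolved automatically.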

In addition to new measures of skeletal maturity, an automated system could allow changes to the methods by which bone ages are assigned to skeletal maturity scores or other measures. For example, the biological weights used by Tanner et al. in the development of the TW method did not take into account the duration of the stages of the bones [Tann01, p6]. They thought that a bone should have a lower weight if it remains in a stage for a long time, and a higher weight if it passes through a stage quickly. This concept was used in the Fels method [De L99], but Tanner et al. felt that a system that included this extra component of variable bone weights would be troublesome to use. This type of change to a bone age method would be much simpler to incorporate into an automated system.
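As a rough illustration of duration-dependent weighting, the sketch below gives each bone a weight inversely proportional to the time it spends in its current stage and normalises the weights to sum to one. The inverse-proportional rule, the function name, and the durations are assumptions made for illustration only; this is not the weighting scheme of the TW or Fels methods.

```python
def duration_weights(stage_durations):
    """Weights inversely proportional to stage duration, normalised to sum to 1.

    stage_durations maps a bone name to the (hypothetical) time in years
    that the bone spends in its current stage.
    """
    if any(duration <= 0 for duration in stage_durations.values()):
        raise ValueError("stage durations must be positive")
    inverse = {bone: 1.0 / duration for bone, duration in stage_durations.items()}
    total = sum(inverse.values())
    return {bone: value / total for bone, value in inverse.items()}


# Hypothetical stage durations in years.
durations = {"radius": 2.0, "ulna": 1.0, "capitate": 0.5}
weights = duration_weights(durations)
for bone, weight in weights.items():
    print(f"{bone}: {weight:.3f}")
```

A bone that passes through its stage quickly (here the capitate) receives the largest weight, which matches the reasoning described above. In an automated system the durations would be looked up per bone and per stage, a step that is tedious for a human assessor but trivial for software.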


1.3.3 Population specific datasets

There are two aspects to the need for population specific datasets. Firstly, much of the data used in older bone age assessment methods was based on historical series of radiographs for particular populations [Cox97]. There has been a worldwide trend of children maturing earlier, and a call for new reference standards has been made [Grah72, Cox97]. Secondly, some investigators believe that the use of a particular bone age method should be restricted to children who share the same socioeconomic and genetic characteristics as the reference population from which the method was developed. Both of these aspects of the reference populations have been addressed in research, yet many studies appear to raise more questions than they answer. For example, a literature review of the effects of ethnicity on skeletal maturation found that there was little influence of ethnicity on bone age assessments, but there was a socioeconomic effect [Schm00]. In a study of the applicability of the reference population for the GP method, different ethnic groups (Asian, Hispanic, Black, and White) had statistically significant differences in bone age and chronological age across all groups, except Asian girls [Onte96]. Other population-specific differences have been demonstrated by another group, who found that in 1986 Japanese children were maturing earlier than children in Belgium, the United Kingdom, and South China when assessed using the TW2 RUS skeletal maturity score [Mura97]. These examples are typical of the conflicting findings in the published literature.

One approach to the problem of differences in reference populations would be to use a single method of skeletal maturity assessment across a number of different populations of children, and to look at a measure of skeletal maturity that is independent of any one reference population. Tanner et al. proposed that the skeletal maturity score from the TW2 (TW3) method was a measure of skeletal maturity that was independent of the reference population [Tann83, p9]. This has been backed up by other investigators who believe that because skeletal maturation is orderly across populations, even in those with malnutrition, a skeletal maturity score may be a more appropriate measure of skeletal maturity than bone age [Roem97]. Instead of trying to relate skeletal maturity to equivalent development in normal children of a specified chronological age, the skeletal maturity scores are used directly. The basis of this is that the score relates only to the radiographic appearance of the bones: instead of determining a score, converting it to an equivalent bone age, and then working out whether that bone age is normal, normal ranges are calculated for the skeletal maturity scores directly. Hence, two children with the same score should be at the same level of skeletal maturity, independent of the population from which they are drawn.
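Using scores directly can be sketched as follows. The example assumes a set of reference skeletal maturity scores for children in one chronological age band, derives a normal range as the mean plus or minus 1.96 standard deviations, and compares a new score against that range without converting it to a bone age. The function names, the reference values, and the use of a normal-distribution range (rather than, say, centiles) are illustrative assumptions.

```python
from statistics import mean, stdev


def normal_range(reference_scores, z=1.96):
    """Approximate normal range (mean +/- z * SD) for one age band.

    Assumes the scores in the band are roughly normally distributed,
    which is an assumption made for this sketch.
    """
    centre = mean(reference_scores)
    spread = stdev(reference_scores)
    return centre - z * spread, centre + z * spread


def classify_score(score, low, high):
    """Compare a skeletal maturity score with a normal range directly."""
    if score < low:
        return "delayed"
    if score > high:
        return "advanced"
    return "within range"


# Hypothetical reference scores for one sex and one chronological age band.
reference = [480, 500, 520, 510, 490, 505, 495]
low, high = normal_range(reference)
print(classify_score(430, low, high))
print(classify_score(510, low, high))
```

In practice a reference population would supply one such range per sex and age band, and the ranges could be recomputed for a new population without changing how the score itself is obtained from the radiograph.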

Establishing normal ranges for skeletal maturity scores, or processing new reference populations for existing methods, would require considerable work. Although much effort would be required in recruitment, clinical measurements and demographics collection (including the challenging tasks of socioeconomic and ethnicity assessment), a significant amount of work would also be required for the bone age assessments. This task could be simplified by an automated bone age system that would allow for the processing of large numbers of radiographs, without the need for duplication of assessments (‘second reading’ of the radiographs).

1.3.4 Development of a research tool

Use of an automated system to improve the accuracy and precision of bone age measurement may help with investigation of questions surrounding the use of the assessment for monitoring skeletal maturity during growth hormone replacement therapy. It is expected that an automated system would reduce the costs of bone age assessment through a reduction in time that radiologists would need to report on bone age radiographs.

Part of the problem with the interpretation of research studies involving the use of bone age is the uncertainty in the results. Using an automated system to reduce uncertainties due to the bone age assessment may improve the power of the study and reduce the number of participants required to achieve a significant result.
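The link between measurement uncertainty and the number of participants can be made concrete with the standard normal-approximation formula for comparing two group means, n = 2(z_{1-α/2} + z_{power})² σ² / δ² per group, where σ² is taken here as the sum of biological variance and assessment (observer) variance. The sketch below applies that formula; the function name and all numerical values are hypothetical and chosen only to show the direction of the effect.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(delta, sd_biological, sd_measurement,
                           alpha=0.05, power=0.8):
    """Participants per group needed to detect a mean difference delta.

    Uses the normal approximation for a two-sided, two-sample comparison
    of means. Total variance is assumed to be the sum of independent
    biological and measurement (assessment) variances.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    total_variance = sd_biological ** 2 + sd_measurement ** 2
    return ceil(2 * (z_alpha + z_power) ** 2 * total_variance / delta ** 2)


# Hypothetical values, all in years of bone age.
with_observer_error = participants_per_group(
    delta=0.5, sd_biological=1.0, sd_measurement=0.6)
without_observer_error = participants_per_group(
    delta=0.5, sd_biological=1.0, sd_measurement=0.0)
print(with_observer_error, without_observer_error)
```

Removing the assessment component of the variance leaves only the biological spread, so fewer participants are needed for the same power. The size of the saving depends entirely on how large observer variability is relative to maturational variance in the study population.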

Using currently available x-ray tube and x-ray generator technologies, combined with digital x-ray capture devices, it should be possible to construct a very low radiation dose, portable, automated bone age assessment system for screening of children. Roche et al. state that “assessment of skeletal maturity is an important part of any epidemiological study involving the physical status or performance of children” [Roch88, p10]. A valid and reliable automated bone age assessment system could be an important tool in many research studies.