EVALUATION OF THE SKELETAL-AGE METHOD OF ESTIMATING CHILDREN’S DEVELOPMENT
III. Comparison of Measurement and Inspection in the Assessment of Roentgenograms
By Donald Mainland, D.Sc. (Med.)
Department of Medical Statistics and Study Group on Rheumatic Diseases, New York University College of Medicine
(Submitted May 1, accepted June 11, 1957.)
This investigation is part of a study of age differences in the bones and joints of children and adults, supported in Canada by the John and Mary R. Markle Foundation and the Division of Medical Research of the National Research Council of Canada, in New York by grant A-104 from the National Institute of Arthritis and Metabolic Diseases, U. S. Public Health Service, and by a grant from Eli Lilly & Company for the study of normal values in anatomy, physiology and biochemistry.
ADDRESS: 550 First Avenue, New York 16, New York.
“MATURATION does not mean sheer increase in bulk.”1* Thus Todd summarized one of the chief reasons for his detailed study of what he called “qualitative” changes in roentgenograms of children’s bones, a study that resulted in his hand atlas2 and its successor, the Greulich-Pyle atlas,3 for the assessment of skeletal age by inspection.
Just as a man’s head is not merely the enlarged head of a child, so the capitate bone in the adult is not a mere enlargement of the original spherical or ellipsoidal ossific center; nor is epiphyseal union to be expressed merely by the thinning of the epiphyseal cartilage plate. Todd therefore dismissed as misdirected effort the attempts (well summarized by McCloy4) to assess development by measurements of carpals and epiphyses, or of carpal area. Nevertheless, after a bony center has appeared, its changes, utilized for inspectional assessment, are changes in size, shape and proportions, as it spreads into the preformed cartilage; and even shape is a quantitative relationship. The dictum attributed to Galileo, “Measure what is measurable, and what is not measurable make measurable,” must appeal strongly to those who are aware of the weaknesses of the inspectional method, illustrated in Parts I and II of these studies5,6: differences between assessors, between assessments of the same films by the same assessor after a short interval (variable error) and systematic change in an assessor’s standards during a year’s lapse of practice.
* The text reads “increase in build,” but “bulk” is the obviously intended meaning.
PURPOSE OF THE INVESTIGATION
The change in shape of a maturity indicator (carpal bone or epiphysis) might be expressed, for example, by a function containing length, breadth and an oblique diameter, or containing the area and the perimeter. Phalangeal epiphyses, being small, would present difficulties, not readily overcome by magnification, which blurs their outlines. It would be still more difficult to express by measurement such features of incipient epiphyseal union as “faintly billowed contour” and “reciprocal parallelism” described by Todd.2
The study reported here employed a much simpler method than those suggested above. In each maturity indicator there is an undoubted, though quantitatively un-known, association between increase in size, even if measured by a single axis, and change of shape in the adult direction. Therefore two questions were posed:
1. How can the measurements be best combined to provide estimates of the skeletal ages given in the atlas?
2. If this measurement technique is applied to another series of children’s roentgenograms, how do the “measurement ages” compare with ages estimated by inspection, the “inspectional ages”?
Obviously, such a simple technique would be weakest in (1) the first few years of life, before all the required indicators have appeared, and (2) in adolescence, when the criteria of maturation are events at the diaphyseo-epiphyseal junction. However, as Todd1 pointed out, it is the intervening 10 years that present the greatest difficulty in his method, and a substitute for inspection would there be specially helpful.
MATERIALS AND METHODS
Atlases
Todd’s atlas;2 also “intermediates,” i.e., films printed from the original roentgenograms used
for Todd’s atlas, and obtained long ago through
the courtesy of Drs. Todd and Francis. The
Greulich-Pyle atlas.3
Test Roentgenograms
The three series of postero-anterior roentgenograms, with inspectional assessments by Mrs. Mainland (RBM), used for previous reports.5,6
1. The Macy Series: Actual-size reproductions of left hands in Macy’s7 Nutrition and Chemical Growth in Childhood; chiefly 46 films from eight boys, ages 4 to 14 years (22 films from seven girls were also used). Assessments of these films by Todd, Francis and Pyle (referred to here as “Todd’s own” assessments) are recorded by Macy.
2. The Orphanage Series: Actual films of the right hands of 74 boys and 61 girls, ages 5 6/12 to 14 6/12 years, one film per child, from St. Joseph’s Orphanage, Halifax, Canada.
3. The Nutrition Series: Actual films of the right hands of 17 boys, ages 3 6/12 to 6 7/12 years, and 23 girls, ages 21 months to 7 years, one or more films per child, obtained in a Halifax nutrition survey.
Series 2 and 3 were all prepared by RBM in the Dalhousie University Anatomy Department. Technique: No-screen films on table top; target-film distance 40 in.; 100 ma; 0.25 second; kv.p. 40 to 41, occasionally 42, depending on thickness.
Desiderata of Measurement Methods
In the devising and testing of methods, two points were kept in mind: (1) For routine clinical use, the simplest adequate method would be desirable. A more complicated method, if decidedly superior, would be desirable in research. (2) The purpose of skeletal age assessment is either (a) study of an individual child, or (b) study of a group, e.g., in a nutrition survey or a dietary experiment. In either case, the information required may be (i) present status, or (ii) progress over a period of time.
The exploration was not intended to be exhaustive of indicators, or of dimensions in any indicator, or of methods of combining the measurements. It was an attempt to see if the method held any promise, and to distinguish between the more promising and the less promising techniques, for the benefit of other explorers in current or future atlases.
Maturity Indicators
Radius, capitate, epiphysis of metacarpal III, epiphysis of proximal phalanx of digit III. In the first three, oblique axes were measured in order to incorporate features that are used for inspectional assessment, e.g., the development of the styloid process of the radius and the angulation of the capitate. With every maturity indicator another reading was made, to be used in adjusting for the variable sizes of hands: shaft widths for use with epiphyses, carpal length for use with capitate.
Dimensions Measured (Fig. 1)
Radius: epiphysis, from proximo-medial to disto-lateral angle; shaft, width of distal end.
Capitate: from proximo-lateral to disto-medial angle; carpal length, from distal end of shaft of radius to proximal end of shaft of metacarpal III.
Metacarpal III: epiphysis, from proximo-lateral to disto-medial angle; shaft, width of distal end.
Proximal phalanx of digit III: epiphysis, width; shaft, width of proximal end.
All measurements were made by Mrs. H. H. Wertheimer on films (or pictures) protected by transparent cellulose 0.01 in. thick, with a fine measuring instrument reading to 0.01 cm. In each series of films the reading order was assigned by random numbers.
Fig. 1. Outline of roentgenogram of hand, to show axes which were measured. R = radius. C = capitate. M = metacarpal. P = phalanx. The vertical line from radius shaft to metacarpal III represents “carpal length.”
Estimation of Age by Moving-Average Curves
Example: Radius epiphysis in Todd’s atlas (male standards). A dot diagram was formed by plotting age (Y) against epiphyseal size (X). The upward trend was not uniform, and at some points depressions occurred, i.e., a larger bone corresponded to a younger age, even when the adjustment for hand size (discussed later) had been applied. To avoid complex mathematical curve fitting, the moving-average method was used as follows: Arrange epiphyseal sizes in ascending order of magnitude. For 3-point smoothing, find the mean epiphyseal size and the mean age for Nos. 1, 2 and 3, and plot mean age against mean size. Drop No. 1, add No. 4, and repeat the process throughout the series.
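The moving-average procedure just described can be sketched in a few lines of Python. This is only an illustration: the function name `moving_average_curve` and the sample sizes and ages are hypothetical, not values from the atlases.

```python
def moving_average_curve(sizes, ages, window=3):
    """Smooth an age-versus-size dot diagram by a moving average.

    Pairs are arranged in ascending order of size; each output point is the
    mean size and mean age of `window` consecutive pairs.
    """
    pairs = sorted(zip(sizes, ages))  # ascending order of bone size
    points = []
    for start in range(len(pairs) - window + 1):
        chunk = pairs[start:start + window]
        mean_size = sum(s for s, _ in chunk) / window
        mean_age = sum(a for _, a in chunk) / window
        points.append((mean_size, mean_age))
    return points


# Hypothetical epiphyseal sizes (cm) and atlas ages (months), for illustration only.
sizes = [1.0, 1.2, 1.1, 1.5, 1.4]
ages = [24, 40, 36, 60, 48]
curve = moving_average_curve(sizes, ages, window=3)
```

Raising `window` to 5 or more corresponds to the heavier smoothing that was needed for most bones.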
Three-point curves were still very irregular; therefore more points were included, and for most bones in both atlases 5-point smoothing sufficed to remove the depressions. Exceptions (in Todd’s atlas): Metacarpal epiphysis in males (7 points), capitate in females (8 points). (For additional points, at the beginnings and ends of the curves, smaller numbers of points, down to 3, had to be used.) From the moving averages, tables were prepared for each of the four indicators in each sex and atlas (Todd atlas and intermediates separately).
The skeletal age of any film, say in the Orphanage Series, was estimated by following these directions: Take the dimension of each of the four maturity indicators to its appropriate table and find the corresponding age by linear interpolation. Summate the ages and divide by 4. (This gives equal weight to each indicator. Unequal weighting will be discussed later.)
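The table look-up and averaging described above can be sketched as follows. The four tables and the measurements are hypothetical placeholders (real tables would hold many more moving-average points), and the function names are illustrative only.

```python
def age_from_table(table, size):
    """Linear interpolation in a table of (size, age) points.

    Assumes the table is sorted by strictly increasing size.
    """
    for (s0, a0), (s1, a1) in zip(table, table[1:]):
        if s0 <= size <= s1:
            return a0 + (a1 - a0) * (size - s0) / (s1 - s0)
    raise ValueError("size lies outside the range of the table")


def skeletal_age(tables, measurements):
    """Equal-weight mean of the ages read from each indicator's table."""
    ages = [age_from_table(tables[name], measurements[name]) for name in tables]
    return sum(ages) / len(ages)


# Hypothetical two-point tables (size in cm, age in months), for illustration only.
tables = {
    "radius_ep": [(1.0, 48), (2.0, 96)],
    "capitate": [(1.0, 40), (2.0, 100)],
    "metacarpal_ep": [(0.5, 50), (1.5, 110)],
    "phalanx_ep": [(0.5, 44), (1.5, 104)],
}
measurements = {"radius_ep": 1.5, "capitate": 1.5, "metacarpal_ep": 1.0, "phalanx_ep": 1.0}
estimate = skeletal_age(tables, measurements)
```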
Adjustment for Hand Size
It was naturally assumed that an adjustment or correction of the size of a maturity indicator by a factor representing general hand size, or more specifically bone size (e.g., an adjacent shaft) would be very important. Three such adjustments were tested, chiefly on the radius epiphysis in Todd’s intermediates for boys: (1) The ratio of epiphysis size to shaft width was clearly inappropriate because in the older boys it did not vary consistently with age. (2) The subtraction of shaft width from epiphysis size was, if anything, less satisfactory. (3) An adjustment, which need not be described in detail, was derived from linear regression equations showing the relationship of epiphysis sizes to age and shaft width independently. (For capitate, the corresponding adjustment was obtained from carpal length.) By the moving-average method described above, the adjusted measurements provided estimates of skeletal age. However, the fit of the actual (atlas) ages to these curves was poorer (when measured by the sum of squares of deviations) than the fit obtained with the unadjusted measurements. Moreover, in the films of boys in the Macy Series the adjustment gave no better agreement with Todd’s own assessments.
Therefore, in all the rest of the work with moving-average estimates the unadjusted sizes were used.
Weighting of Maturity Indicators
In the inspectional estimate, when each indicator is assessed separately, the final estimate is usually an average, giving equal weight to each indicator. In the moving-average estimates from bone sizes, two methods of unequal weighting were tested on Todd intermediates (male standards). For each maturity indicator, the differences between the moving-average estimates and the actual (atlas) ages were expressed as a variance (mean-square-deviation). Then, in finding the average age for each film in the series of intermediates, the indicators were weighted (1) in reverse order to the corresponding variances, and (2) in proportion to the reciprocals of their variances. The weighted-average estimates of age were each compared with the equal-weight estimates by finding which varied more (by sum of squares of deviations) from the actual (atlas) ages. Method (2) produced larger variation than did equal-weight estimates, and Method (1) did not show sufficient superiority over equal-weight estimates to justify using it.
In all three test series of films, the ages estimated from each of the four individual indicators were compared with inspectional assessments, both by numerical differences and by the order of proximity to the inspectional ages. No indicator gave consistently better or worse results than the others. Therefore, in the rest of the work equal-weight averages of the four indicators were used.
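As an illustration of weighting Method (2), the sketch below computes a mean-square-deviation for an indicator and then averages indicator ages with weights proportional to the reciprocals of the variances. All numbers are hypothetical; Method (1), weighting by reverse rank order of the variances, is not shown.

```python
def mean_square_deviation(estimates, atlas_ages):
    """Variance of an indicator's estimates about the actual (atlas) ages."""
    return sum((e - a) ** 2 for e, a in zip(estimates, atlas_ages)) / len(atlas_ages)


def weighted_mean_age(indicator_ages, variances=None):
    """Mean of indicator ages; equal weights unless variances are supplied,
    in which case each weight is the reciprocal of the indicator's variance."""
    if variances is None:
        weights = [1.0] * len(indicator_ages)
    else:
        weights = [1.0 / v for v in variances]
    return sum(w * a for w, a in zip(weights, indicator_ages)) / sum(weights)


# Hypothetical ages (months) from four indicators and hypothetical variances.
ages = [72.0, 70.0, 80.0, 74.0]
variances = [16.0, 64.0, 16.0, 64.0]
equal_weight = weighted_mean_age(ages)
reciprocal_weight = weighted_mean_age(ages, variances)
```

In this made-up example the two indicators with the smaller variances dominate the weighted estimate, which is why it moves away from the equal-weight mean.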
Estimation of Age by Multiple Regression Equations
From 2 years to over 15 years, in boys, the age-size relationship of any one indicator could be represented very largely by a straight (linear) regression equation, which would provide estimates of skeletal age (Y) from bone size (X). By multiple regression, any number of indicators could be introduced, each receiving automatically its appropriate weight; and the equation could include adjustments for individual hand size (e.g., width of radius shaft).*
From measurements on Todd’s intermediates, Male Standards H6 (21 months) to H34 (15 9/12 years) inclusive, the equation was as follows (age in months; bone sizes in cm; ep. = epiphysis; met. = metacarpal III; phal. = proximal phalanx of digit III):
Age = 34.82 (radius ep.) + 22.67 (capitate) + 12.57 (met. ep.) + 20.74 (phal. ep.) - 18.43 (phal. shaft) - 26.02
The phalanx shaft appears in the equation as an adjustment for general bone size, because, after the age-epiphysis relationship had been eliminated, the correlation between phalanx shaft width and age was significant† (r = -0.450; P = 0.01 approx.); but the radius shaft does not appear, because the correlation between it and age, after elimination of the epiphysis, was not significant (r = -0.273; P between 0.2 and 0.1). For a similar reason, the metacarpal shaft was omitted. The carpal length could have been used because its correlation with age, even after elimination of capitate size, was +0.481 (P between 0.02 and 0.01), but it was omitted because it does not clearly contribute to inspectional assessments, whereas the phalangeal shaft-epiphysis relationship is considered in those assessments.
The equation containing the same variates, derived from Todd’s intermediates, Female Standards H5 (15 months) through H33 (15 3/12 years) was:
Age = 56.92 (radius ep.) + 7.55 (capitate) + 10.34 (met. ep.) + 2.88 (phal. ep.) - 1.56 (phal. shaft) - 47.15
* It should be mentioned that this use of regression differs from the customary form, in which the dependent variate Y (here, age) is not selected and is, at least approximately, distributed normally (i.e., in Gaussian form) at each value of the independent variate X (here, bone size). These conditions do not hold here, because ages were selected in forming the atlases. Hence it is inappropriate to test the equations against the data from which they were derived, for such tests depend on normally distributed residual variation (Gaussian scatter of Y values about the regression line). In this study the tests were made against inspectional estimates on other films (the test series).
† Throughout the study, the minimal standard for statistical significance was the 5 per cent level (P less than 0.05).
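As a sketch of how the male regression equation is applied, the snippet below evaluates it for one set of bone sizes. The coefficients are those printed above; the bone sizes and the dictionary keys are hypothetical. It also shows how sensitive the estimate is to a uniform enlargement of every dimension: 0.05 cm added to each of the five dimensions raises the estimate by about 3.6 months, the figure quoted in the Results.

```python
# Coefficients of the male equation (age in months, bone sizes in cm).
MALE_COEFFS = {
    "radius_ep": 34.82,
    "capitate": 22.67,
    "met_ep": 12.57,
    "phal_ep": 20.74,
    "phal_shaft": -18.43,
}
MALE_CONSTANT = -26.02


def regression_age(sizes_cm, coeffs=MALE_COEFFS, constant=MALE_CONSTANT):
    """Estimated skeletal age in months from the five measured dimensions."""
    return constant + sum(coeffs[name] * sizes_cm[name] for name in coeffs)


# Hypothetical measurements (cm), for illustration only.
sizes = {"radius_ep": 2.0, "capitate": 1.5, "met_ep": 1.0, "phal_ep": 1.0, "phal_shaft": 1.0}
age = regression_age(sizes)

# Effect of a uniform 0.05 cm enlargement of every dimension.
enlarged = {name: value + 0.05 for name, value in sizes.items()}
shift = regression_age(enlarged) - age
```

The female equation has the same form and could be evaluated by passing its coefficients and constant to `regression_age`.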
RESULTS
Status of Groups of Children
MEAN VALUES IN ORPHANAGE SERIES (135 CHILDREN, AGES 5 6/12 TO 14 6/12 YEARS): Table I, showing the average retardation of skeletal age behind chronologic age, reveals four noteworthy facts:
1. In boys the mean retardation was similar (5 3/4 to 7 1/2 months) whether the ages were estimated by measurement (regression or moving-average) on Todd’s intermediates, by inspection of Todd’s atlas, or by inspection of the Greulich-Pyle atlas; and in item V the mean inspection-measurement difference (0.22 month) is very small and far from significant.
Accepting these figures as representative of what would be met in a survey of boys, in this age range, assessed by measurement, we could use twice the SD of the mean difference in item V (i.e., 2 × 1.06 = 2.12 months = 9 weeks) and assert with considerable confidence (about 95% probability) that the mean inspectional age would not have differed from our measurement estimate by more than about ±9 weeks.
With a larger group the confidence interval would be narrower. Thus, for a group of 400 boys, and SD of series 9.11 months as in item V, the SD of the mean difference would be 9.11/√400 = 0.46 month, and the interval would be ±0.92 month = ±4 weeks, less than half the possible error for the Orphanage sample of 74 boys.

TABLE I
DIFFERENCES BETWEEN CHRONOLOGIC AGES AND SKELETAL AGES ESTIMATED BY DIFFERENT STANDARDS AND METHODS: ORPHANAGE SERIES

                                                Skeletal Age Minus Chronologic Age (mo)
Standards and Methods                     Sex   Mean      SD of Series   SD of Mean
Todd’s Standards
I. Meas’t., intermed., regr.              B     -6.00     13.86          1.61
                                          G     -6.49     11.83          1.52
II. Meas’t., intermed., mov. av.          B     -6.22     14.89          1.73
III. Meas’t., atlas, mov. av.             B     -10.01    14.18          1.65
                                          G     -11.05    12.14          1.56
IV. Insp., atlas, 1st reading             B     -5.78     12.13          1.41
                                          G     -3.92     10.78          1.38
V.* Diff., IV-I                           B     +0.22     9.11           1.06
                                          G     +2.57     10.42          1.33
VI. Diff., insp., 1st-2nd reading         B     -0.18     5.06           0.59
Greulich-Pyle Standards
VII. Meas’t., atlas, mov. av.             B     -18.04    13.75          1.60
                                          G     -18.72    12.53          1.61
VIII. Insp., atlas, 1st reading           B     -7.59     14.13          1.64
                                          G     -6.74     10.30          1.32
IX.* Diff., VIII-VII                      B     +10.45    9.10           1.06
                                          G     +11.98    9.11           1.17

* Items V and IX are equivalent to skeletal age by inspection minus skeletal age by measurement.
B = 74 boys (66 to 175 mo); G = 61 girls (65 to 171 mo); Meas’t. = measurement; Insp. = inspection; intermed. = Todd’s intermediates; mov. av. = moving-average; regr. = regression.
2. In girls the mean difference in item V (2.57 mo) is larger than in boys, but not quite twice its standard deviation; therefore not quite at the 5% level of significance.
3. In boys and girls the retardation estimated by measurement (moving-average, item III) is 10 to 11 months by Todd’s atlas. That is, measurement on the atlas gave estimates of mean skeletal ages 4 to 5 months lower than did measurements on the intermediates (items I and II) although the atlas pictures were reproductions of the intermediates.
This difference could be accounted for by the larger shadows in the atlas, a magnification that must have occurred in reproduction. A slight difference can produce a considerable effect; e.g., if each bone in the male equation (p. 982) were 0.05 cm larger, the estimated age would be 3.6 months greater. In boys of the Orphanage age range, the differences, atlas minus intermediates, ranged from +0.012 cm (in a phalanx epiphysis) to +0.067 cm (in a radius epiphysis). When the ages of the atlas films (male standards) in this age range were estimated from the equation for intermediates (p. 982), they were found to be higher than the actual (atlas) ages by a mean value of 4.15 months, which is very similar to the difference found in the Orphanage boys (Table I, item I minus item III = +4.01 mo).
4. In boys and girls assessed on Greulich-Pyle standards the mean measurement age was 10 to 12 months lower (item IX) than the mean inspectional age. This again could be attributed to enlargement of images during reproduction, because, when the skeletal ages of the Greulich-Pyle atlas films, in the Orphanage children’s age range, were estimated by the equations from Todd’s intermediates they were found to exceed the actual (Greulich-Pyle) ages by mean values of 8.27 months in boys and 14.43 months in girls.
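The arithmetic behind the ±9-week and ±4-week limits quoted in item 1 can be checked with a short sketch. The conversion of 52/12 weeks per month is an assumption made here for illustration; the paper does not state the factor it used.

```python
import math

WEEKS_PER_MONTH = 52 / 12  # assumed conversion factor, not stated in the text


def sd_of_mean(sd_series, n):
    """Standard deviation (standard error) of a mean of n observations."""
    return sd_series / math.sqrt(n)


def two_sd_limit_weeks(sd_series, n):
    """Approximate 95% limit (2 x SD of the mean), expressed in weeks."""
    return 2 * sd_of_mean(sd_series, n) * WEEKS_PER_MONTH


# Item V, boys: SD of series 9.11 months.
se_74 = sd_of_mean(9.11, 74)       # about 1.06 months
se_400 = sd_of_mean(9.11, 400)     # about 0.46 month
limit_74 = two_sd_limit_weeks(9.11, 74)    # about 9 weeks
limit_400 = two_sd_limit_weeks(9.11, 400)  # about 4 weeks
```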
MEAN VALUES IN NUTRITION SERIES (40 CHILDREN, AGES 21 MONTHS TO 7 6/12 YEARS): This series (Table II) contained mostly children younger than those in the Orphanage Series (Table I) and therefore revealed, as was expected, one of the weaknesses of the measurement method: its inability, in its present form, to make proper allowance for absent maturity indicators. Many of the children in the Nutrition Series provided more than one film, separated by varying intervals, but for Table II only one film was used from each child, the earliest, except where indicators were missing. The estimates of mean retardation in Table II are biased, therefore, as a description of the group surveyed, but are useful for the comparison of methods. The table reveals four interesting features:
1. The two inspectional methods (items II and V) by the Todd and Greulich-Pyle atlases are in fair agreement.
2. For boys and girls assessed by the Greulich-Pyle atlas the mean measurement age (moving-average method) was significantly lower than the mean inspectional age (item VI; P less than 0.01), but the difference was less than in Table I (item IX).
3. By one measurement assessment the boys were given a higher skeletal age than by inspection (items I, II and III), an exception to the general rule here and in Table I. No explanation was found.
4. Measurement on Todd’s intermediates (item I) gave skeletal ages higher than Greulich-Pyle measurements (item IV): mean differences of 9.4 months in boys and 1.4 months in girls. This again was largely attributable to enlargement of images in reproduction, for estimation of ages for the Greulich-Pyle atlas films (in the Nutrition children’s age range) by the equations from the Todd intermediates gave rather similar mean differences: 10.2 months in boys and 3.1 months in girls.
MEAN VALUES IN MACY SERIES (46 FILMS FROM 8 BOYS, AGES 4 TO 14 YEARS): Comparisons of assessment methods revealed two points of interest:
TABLE II
DIFFERENCES BETWEEN CHRONOLOGIC AGES AND SKELETAL AGES ESTIMATED BY DIFFERENT STANDARDS AND METHODS: NUTRITION SERIES

                                                Skeletal Age Minus Chronologic Age (mo)
Standards and Methods                     Sex   Mean      SD of Series   SD of Mean
Todd’s Standards
I. Meas’t., intermed., regr.              B     +0.59     16.81          4.08
                                          G     -7.17     18.76          3.91
II. Insp., atlas                          B     -5.41     18.38          4.46
                                          G     -2.39     13.78          2.87
III. Diff., II-I                          B     -6.00     6.64           1.61
                                          G     +4.78     17.16          3.58
Greulich-Pyle Standards
IV. Meas’t., atlas, mov. av.              B     -8.82     15.28          3.71
                                          G     -8.61     9.46           1.97
V. Insp., atlas                           B     -5.24     17.51          4.25
                                          G     -3.13     12.04          2.51
VI. Diff., V-IV                           B     +3.59     4.33           1.05
                                          G     +5.48     4.44           0.93

B = 17 boys (42 to 79 mo); G = 23 girls (21 to 83 mo); Meas’t. = measurement; Insp. = inspection; intermed. = Todd’s intermediates; mov. av. = moving-average; regr. = regression.
1. Ages estimated by three methods (regression on intermediates, moving-average on intermediates, moving-average on atlas) were compared, film by film, with Todd’s own inspectional assessments. Each of the three measurement methods produced a significant majority of films with higher age estimates than corresponding Todd estimates, i.e., a significant majority of positive values of the difference, measurement minus Todd. However, boys differed significantly from each other in this inspection-measurement relationship, just as they did in the comparison of RBM’s inspectional assessment with Todd’s assessment.5
2. Comparison of the three measurement methods with RBM’s inspectional assessments showed also a significant preponderance of positive values of the measurement minus inspection difference. In view of the frequent agreement of measurement with RBM’s inspection in the Orphanage and Nutrition Series, perhaps the explanation was enlargement of the Macy film images during reproduction. Even “actual-size” reproduction can hardly be expected to be exact within 0.01 cm.
INTER-CHILD VARIATION IN THE ORPHANAGE SERIES: In Table I (items I-IV, VII and VIII) the SD’s of the series express inter-child variation in the relationship of skeletal age to chronologic age, and the estimates are similar in magnitude, whichever standard or method (inspection or measurement) was used: 12 to 15 months in boys, 10 to 14 months in girls. In the Nutrition Series (Table II, items I, II, IV and V) the differences are rather greater, especially in girls.
The frequency distribution of retardation (and acceleration) in any population should not be presumed Gaussian in form; therefore it is unsafe to compare SD’s by ... If we estimate the skeletal ages of a group of children both by measurement and by inspection, and then we compare the inter-child variation in skeletal age with the variation found in standard (healthy) children, how do the verdicts of the two methods differ?
PERCENTILE DISTRIBUTIONS OF INTER-CHILD VARIATION: The standard series required to answer the above question, the Western Reserve (Brush Foundation) children on whom Todd developed his standards, is provided by the Introduction to the Greulich-Pyle atlas, in the form of means and standard deviations for skeletal age in chronologic age groups of 1 year (less than 1 year for younger children). The distributions (both male and female) apparently were considered sufficiently Gaussian by Greulich and Pyle to justify the use of multiples of the SD’s in setting off upper and lower limits of “normal,” and even if this is not fully justified it will not vitiate the comparisons to be made here. Therefore the mean ±1.2816SD was used to estimate the 10th and 90th percentiles, and the mean ±1.96SD was used for the 2.5 percentiles.
For the chronologic age of each child its estimated skeletal age was located by reference to these percentiles, and in Table III (Orphanage Series) the boys’ and girls’ data were pooled because there was no significant difference between their frequency distributions. As in assessing body weight, probably the 10th percentile is of most interest as the boundary between children who can be accepted as not “abnormally” retarded and those who should be “viewed with suspicion.” Using the 10th percentile we observe three noteworthy features:
1. Assessment by measurement (Todd’s intermediates, items I and II) showed approximately the same proportion of children below the 10th percentile as did inspectional assessment (Todd and Greulich-Pyle atlases, first readings, items IV and VII).
2. There was closer agreement between measurement assessment (Todd’s intermediates, items I and II) and inspectional assessment (1st readings, item IV) than there was among the inspectional assessments themselves (items IV and V).
3. Measurement assessments by the atlases (items III and VI), especially the Greulich-Pyle atlas, differed from inspectional assessments (items IV, V, VII and VIII) and from measurement assessments by the intermediates (items I and II), another manifestation of the discrepancy found in examining the mean values, and attributable to magnification during reproduction of films for the atlases. Similar classification of the Nutrition Series (one film from each of 45 children) showed, below the 10th percentile: By inspectional assessment, 35.6%; by measurement, 60.0%.

TABLE III
DIFFERENCES (SKELETAL AGE MINUS CHRONOLOGIC AGE) IN ORPHANAGE SERIES (135 CHILDREN, 65 TO 175 MO) CLASSIFIED BY PERCENTILES DERIVED FROM STANDARD (WESTERN RESERVE3) CHILDREN

                                          Percentiles from Standard Series
Standards and Methods                     Below 10th   10th to 90th   Above 90th
Todd’s Standards
I. Meas’t., intermed., regr.              44.5%        51.9%          3.7%
II. Meas’t., intermed., mov. av.          43.2%        51.4%          5.4%
III. Meas’t., atlas, mov. av.             59.5%        36.5%          4.1%
IV. Insp., atlas, 1st reading             43.0%        55.6%          1.4%
V. Insp., atlas, 2nd reading              37.0%        61.5%          1.5%
Greulich-Pyle Standards
VI. Meas’t., atlas, mov. av.              81.5%        17.8%          0.7%
VII. Insp., atlas, 1st reading            47.4%        51.1%          1.5%

Meas’t. = measurement; Insp. = inspection; intermed. = Todd’s intermediates; mov. av. = moving-average; regr. = regression.
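The percentile classification described above, which places a child's skeletal age relative to mean ± 1.2816 SD of the standard children of the same chronologic age, can be sketched as follows. The standard mean and SD used in the example are hypothetical, not values from the Greulich-Pyle Introduction.

```python
from statistics import NormalDist

# Normal deviate cutting off the upper 10%; about 1.2816, as used in the text.
Z_10 = NormalDist().inv_cdf(0.90)


def percentile_limits(standard_mean, standard_sd):
    """Estimated 10th and 90th percentiles of skeletal age in the standard series."""
    return standard_mean - Z_10 * standard_sd, standard_mean + Z_10 * standard_sd


def classify(skeletal_age, standard_mean, standard_sd):
    """Place a skeletal age relative to the standard 10th and 90th percentiles."""
    low, high = percentile_limits(standard_mean, standard_sd)
    if skeletal_age < low:
        return "below 10th"
    if skeletal_age > high:
        return "above 90th"
    return "10th to 90th"


# Hypothetical standard group: mean skeletal age 120 months, SD 10 months.
labels = [classify(age, 120.0, 10.0) for age in (105.0, 120.0, 135.0)]
```

Tabulating such labels over all children in a series gives percentages of the kind shown in Table III.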
Status of Individual Children
Although Table I shows agreement among the mean skeletal ages obtained by inspectional assessment and those obtained by measurement of Todd’s intermediates, item V reveals the lack of agreement on individual films. The frequency distribution of the individual differences (IV-I) was roughly Gaussian. Therefore we can double the SD’s (9.11 and 10.42) and then, after estimating a particular child’s skeletal age by measurement, we can assert that probably the corresponding inspectional assessment would lie within ±18 to 21 months of our measurement assessment, but we cannot (with 95% probability) be more definite than that. The following additional details deserve attention:
1. In item IX, although there is a large mean difference, the individual variation is about the same as in item V (2SD = ±18 months).
2. There was a correlation (+0.455 in the 74 boys) between the chronologic age and the difference (inspectional minus measurement age, item V), but this would reduce the SD (9.11 in item V) by only about 1 month.
3. Part of the individual variation (item V) was due to the variation (variable error) between inspectional assessments themselves (item VI), which has a 2SD range of ±10 months, as shown previously.6 By contrast, replicate measurement assessment (moving-average on Greulich-Pyle atlas) of 83 Orphanage Series films shows a 2SD range (variable error) of ±2.31 months.
In the Nutrition children (Table II, item VI) the 2SD range of differences between measurement age and inspectional age was below ±9 months for both boys and girls assessed by the Greulich-Pyle atlas, but the sexes differed greatly in this variation when assessed by the Todd atlas (item III). This was mainly due to one girl’s film, assessed as 73 months by inspection and as 5 months by measurement (chronologic age, 51 months), an example of the unreliability, in the youngest children, of the measurement method as developed at present.
DIFFERENCES IN PERCENTILE LOCATION: Of probably greater practical importance than the absolute variation, in months, between measurement and inspectional ages, are two questions: (1) What proportion of children would be located differently, by reference to the standard children’s 10th percentile, by the two methods? (2) How does this difference compare with the difference between replicate inspectional assessments of the same film?
For 135 Orphanage children assessed on Todd’s atlas by inspection (each child twice independently, by RBM) and by measurement (regression method on intermediates), the figures were:
Of 77 children located above the 10th percentile by inspection (1st reading), 16 (21%) were located below the 10th percentile by measurement, whereas only 3 (4%) were so located by the second inspectional reading.
Of 58 children located below the 10th percentile by inspection (1st reading), 14 (24%) were located above the 10th percentile by measurement; but 11 (19%) were so located by the second inspectional reading.
Therefore, when a measurement method does not differ systematically (i.e., on the average) from the corresponding inspectional method, there seems to be no strong evidence in favor of inspectional assessment of individual children.
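The percentages quoted for children located differently can be reproduced from the counts given above. The pooled disagreement rates in the last two lines are derived here by simple arithmetic and are not stated in the paper.

```python
def percent(count, total):
    """Count as a percentage of total."""
    return 100.0 * count / total


# Counts from the text: 77 children above and 58 below the 10th percentile
# by the first inspectional reading (135 children in all).
above_moved_by_measurement = percent(16, 77)   # about 21%
above_moved_by_second_read = percent(3, 77)    # about 4%
below_moved_by_measurement = percent(14, 58)   # about 24%
below_moved_by_second_read = percent(11, 58)   # about 19%

# Pooled over all 135 children (derived figures, not quoted in the paper).
overall_measurement = percent(16 + 14, 135)
overall_second_read = percent(3 + 11, 135)
```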
Progress of Groups of Children
PROGRESS IN THE MACY SERIES: From each of eight boys, two or more films were available, and the progress in chronologic and skeletal age was estimated between consecutive films, e.g., first and second, third and fourth (total film-pairs = 22). In the ... boys differed from each other in this respect; therefore the 22 film-pairs could be taken to represent one pair from each of 22 boys, e.g., in a dietary experiment (except that the observation period would be alike in all children).
The mean chronologic gain was 13.09 months (range, 2 to 37 months) and the estimates of mean skeletal age gain were as follows (Intermed. = Todd’s intermediates; regr. = regression method; mov. av. = moving-average method):
Todd’s standards, measurement: Intermed., regr., 15.14 months. Intermed., mov. av., 15.77 months. Atlas, mov. av., 15.32 months.
Todd’s standards, inspection: Todd’s own assessments, 15.45 months. RBM, 1st reading, 17.14 months. RBM, 2nd reading, 15.95 months.
Greulich-Pyle standards, measurement: Atlas, mov. av., 15.77 months.
Greulich-Pyle standards, inspection: RBM, 1st reading, 13.86 months. RBM, 2nd reading, 14.82 months.
Ana!’sis of variance showed that the differences among these mean values were far from significant, and, although a much larger sample might reveal a significant difference, there seemed to be little likeli-hood that this would occur in some of the contrasts. Thus, the mean difference, meas-urement (regression on intermediates)
minus Todd’s own assessments, = -0.31
months. SD of mean diff. = 1.90 months.
t = 0.16. P is between 0.9 and 0.8.
Accept-ing, therefore, the :eal mean difference as zero, we have 2SD = ±3.80 months. That
is, in about 95% of surveys, with 22 boys,
the mean progress estimated by measure-ment ages could be expected to be less than 4 months above or below the estimate made by Todd.
This precision would hardly suffice, but if there were 100 boys in a survey, and if the variation among measurement-inspection differences remained as in the Macy Series, the 2SD range would be less, because the SD of a mean is proportional to 1/√N. In fact, 2SD would be ±1.78 months.
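The rescaling to a survey of 100 children follows directly from the 1/√N rule. The sketch below assumes that the SD of a single film-pair difference is recovered by multiplying the SD of the mean by √N for the observed sample; the function name is hypothetical.

```python
import math


def two_sd_for_n(sd_of_mean, n_observed, n_target):
    """Rescale 2 SD of a mean difference from n_observed to n_target subjects."""
    # SD of a single difference, recovered from the SD of the observed mean.
    sd_single = sd_of_mean * math.sqrt(n_observed)
    # SD of the mean in a survey of n_target subjects, doubled for the 2SD range.
    return 2.0 * sd_single / math.sqrt(n_target)


# Macy boys: SD of mean difference = 1.90 months from 22 film-pairs.
print(round(two_sd_for_n(1.90, 22, 100), 2))
```

The same function can be applied to any contrast for which the SD of the mean difference and the number of film-pairs are stated.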
Other estimates, provided by the Macy
Series of boys, were compared with estimates showing the variability of the inspectional
method itself:
Todd’s standards, measurement (regr. on intermed.) minus inspection (RBM): SD of mean diff. = 2.15 months. If N were 100, 2SD would be ±2.02 months.
Todd’s standards, inspection (RBM) first minus second reading: SD of mean diff. = 1.39 months. If N were 100, 2SD would be ±1.30 months, a somewhat smaller variability than between measurement and inspection.
Greulich-Pyle standards, measurement (mov. av.) minus inspection (RBM): SD of mean diff. = 1.32 months. If N were 100, 2SD would be ±1.24 months.
Greulich-Pyle standards, inspection (RBM) first minus second reading: SD of mean diff. = 1.54 months. If N were 100, 2SD would be ±1.44 months.
Estimates from the Macy Series of seven girls (11 film-pairs) on Greulich-Pyle standards were:
Measurement (mov. av.) minus inspection (RBM): SD of mean diff. = 1.83 months. If N were 100, 2SD would be ±1.21 months.
Inspection (RBM) first minus second reading: SD of mean diff. = 2.04 months. If N were 100, 2SD would be ±1.35 months.
In both boys and girls (on Greulich-Pyle standards), therefore, the measurement estimates agreed as closely with the inspectional estimates as did the two inspectional estimates with each other.
PROGRESS IN THE NUTRITION SERIES: Twenty-eight children each provided a pair of films (intervals: 4 to 13 months) which were assessed by Greulich-Pyle standards. Chronologic mean gain = 6.61 months. Estimated mean skeletal age gains: measurement (mov. av.), 5.11 months; inspection (RBM) first reading, 5.61 months; second reading, 6.11 months. The differences were not significant.
Measurement (mov. av.) minus inspection (RBM, first reading): SD of mean diff. = 0.55 months. If N were 100, 2SD would be ±0.58 months.
Inspection (RBM) first minus second reading: if N were 100, 2SD would be ±1.23 months.
Again, the relationship of the measurement estimate to the inspectional estimate seems closer than the relationship of the two inspectional estimates to each other.
Progress of Individual Children
The mean differences discussed in the previous section were derived from individual differences, one from each film-pair, of the form: gain in skeletal age estimated by measurement minus gain in skeletal age estimated by inspection. These individual differences varied much among the film-pairs, whichever atlas standards were used. For example, on Greulich-Pyle standards, in the 22 film-pairs from Macy boys the differences ranged from -11 months to +19 months; and in the 28 film-pairs from Nutrition children they ranged from -5 to +7 months.
However, in the same film-pairs, also on Greulich-Pyle standards, the differences between inspectional estimates of age gain (first reading minus second reading) had, if anything, a wider range: Macy boys, -18 to +19 months; Nutrition children, -13 to +13 months.
CONCLUSIONS AND DISCUSSION
The foregoing results seem to justify the following conclusions:
1. By using simple measurements of a few maturity indicators (epiphyses and carpals) it is possible to obtain estimates of skeletal ages that are equivalent to inspectional estimates in the group-study of children between the age when ossific centers have appeared and the age of incipient epiphyseal fusion, for the purpose of comparing groups with reference to (a) average skeletal ages, (b) inter-child variation in skeletal age, and (c) average progress (average gain in skeletal age).
2. Because the preparation of atlases causes a change in the sizes of shadows, even in “actual-size” reproduction, the measurement method, unless based on intermediates, does not permit reliable comparison, even of groups, with standard (atlas) children. It is, however, not at all clear that the measurement method is any less reliable in this respect than the inspectional method; and, as suggested below, the impediment can be removed.
3. In the assessment of the skeletal age status or progress of individual children, the differences between the measurement estimate and the inspectional estimate vary so greatly from one film to another that the measurement method seems to be of little use; but replicate inspectional assessments of the same film by the same observer also vary so much that the assessment of an individual child is very crude.
The simplest way for an observer to apply the measurement method is to produce, from his chosen atlas, his own moving-average tables for each maturity indicator, as described in this report. Measurement to the nearest 0.01 cm is desirable, and a lesson learned in the present study should be mentioned. After the measurement assessment of Orphanage films on Greulich-Pyle standards had been made, many other films were measured with the same dividers applied to the same steel scale, and then the Orphanage films were remeasured. The second measurements were slightly smaller than the first, and the mean difference in skeletal age (first reading minus second reading in 83 films) was -0.826 months, SD of mean = 0.126, t = 6.56, P less than 0.001. This difference was apparently due to the wearing of the zero scale division, at which one leg of the dividers was placed while the zero of the vernier was aligned with the other leg. (A cellulose protector might be the easiest way to avoid this risk.)
[...] to find the correlation between age and the size of each (proximal and middle) epiphysis separately, and also the correlation between the two epiphyses. Then, if elimination of the proximal epiphysis (by partial correlation) reduced the correlation between age and the middle epiphysis to a nonsignificant value, it could be concluded that the middle epiphysis would contribute little or no independent information to the age estimates that contained the proximal epiphysis.
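The partial-correlation check described here can be sketched with the standard first-order formula, r(xy.z) = (r(xy) - r(xz) r(yz)) / √((1 - r(xz)²)(1 - r(yz)²)). The three correlation values below are hypothetical, chosen only to show the calculation; no such values are reported in this study.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical correlations (NOT from the paper):
r_age_middle = 0.90         # age vs. middle epiphysis
r_age_proximal = 0.95       # age vs. proximal epiphysis
r_middle_proximal = 0.93    # middle vs. proximal epiphysis

# Correlation of age with the middle epiphysis after eliminating the proximal one.
r_partial = partial_correlation(r_age_middle, r_age_proximal, r_middle_proximal)
print(r_partial)
```

A partial correlation near zero would suggest that the middle epiphysis adds little once the proximal epiphysis is in the estimate; whether a given value is nonsignificant depends on the number of films.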
The greatest difficulties in this study arose at the two ends of the age span: (a) before all the ossific centers had appeared, and (b) when, in some Orphanage girls, certain of the indicators (especially the capitate) exceeded in size the largest incorporated in the formulae derived from the atlas. Carpal assessments are obviously necessary, however, and, if the study were to be repeated, more attention would be paid to carpal length (between the ends of shafts of radius and metacarpal III) as a possible substitute for measurements of individual carpal bones.
The regression method of utilizing the measurements has three advantages over the moving-average method: (1) It automatically assigns appropriate weights to the various indicators. (2) It permits adjustments (correction terms) for general or local bone size. (3) It does not suddenly weaken near the two ends of the age span, whereas the moving-average method depends on a rapidly decreasing number of points.
The two main disadvantages of the regression method are not insuperable: (1) Arithmetically it is somewhat complex and laborious; but an intelligent and accurate student who can use a calculating machine finds little difficulty in following arithmetical instructions8 for solving simultaneous equations. (2) The equations used in this study represent only straight-line relationships between age and bone size, whereas the moving-average method allows for curvilinearity to some extent, although not very greatly when five or more points are averaged. Further research might provide a mathematically-fitted line of the growth-curve type.
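The "simultaneous equations" of the regression method are the normal equations of least squares. The following is a minimal sketch of a straight-line multiple regression of age on indicator sizes, solved by Gaussian elimination. The sizes, ages and coefficients are invented for illustration and are not the equations of this study; the function names are likewise hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, built from copies so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(sizes, ages):
    """Least-squares coefficients [intercept, b1, b2, ...] for age on sizes."""
    rows = [[1.0] + list(s) for s in sizes]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ages)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_age(coeffs, size):
    """Age (months) predicted from one film's indicator sizes."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], size))


# Hypothetical standards: two indicator sizes (cm) and ages (months).
sizes = [(0.5, 1.0), (0.8, 1.1), (1.0, 1.5), (1.3, 1.6), (1.5, 2.0)]
ages = [50.0, 64.0, 80.0, 94.0, 110.0]
coeffs = fit_multiple_regression(sizes, ages)
print(coeffs, predict_age(coeffs, (1.1, 1.4)))
```

The fitted coefficients act as the weights that the regression method assigns to each indicator; a correction term for general bone size would enter as one more column of sizes.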
A final suggestion may be made. Anyone who has prepared a standard series of films, for a current or future atlas, could publish a suitable age-by-measurement equation (or tables derived therefrom) along with explicit instructions regarding x-ray technique (including centering and target-film distance) and regarding measurement techniques. He could publish some clearly reproduced films with a statement of the bone dimensions as measured by him (on the reproductions) so that any assessor could avoid significant systematic error by practice and by rechecking himself from time to time.
If this suggestion were followed it would not only relieve an assessor of developing a moving-average curve or an equation; it would enable him to compare a group of children, not only with another group measured by him, but directly with the group of standard children. It would, moreover, greatly reduce the cost of atlas production.
SUMMARY
The inspectional method of estimating children’s skeletal ages from an atlas of standard films is based on the fact that increase of size (e.g., of a carpal bone or epiphysis) is not equivalent to maturation. The deficiencies of the inspectional method, however, render measurement methods very desirable.
After ossific centers have appeared, further changes (development or maturation) are largely expressible as changes in shape or proportions, and are therefore measurable; but, to avoid the complexity of expressing shape by measurement, the present study started from the premise that change in shape is associated (although to an unknown degree) with increase in linear dimensions.
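Simple linear measurements were obtained from four maturity indicators: the radius epiphysis, the capitate, the metacarpal III epiphysis, and the epiphysis of the proximal phalanx of digit III.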
To express the relationship between age and the size of each indicator, moving-average curves were developed (usually by 5-point averages). To obtain a “measurement estimate” of the skeletal age of any film, the age estimates derived from the curves for the four indicators were averaged, giving equal weight to each indicator.
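The moving-average procedure might be sketched as follows. Several details here are assumptions rather than the study's stated method: the standards are taken in order of age, both age and size are averaged over each run of five consecutive standards, and an age is read from the curve by linear interpolation. The standards themselves are invented for illustration.

```python
def moving_average_curve(ages, sizes, window=5):
    """Average `window` consecutive standards into (mean size, mean age) points."""
    points = []
    for start in range(len(ages) - window + 1):
        mean_age = sum(ages[start:start + window]) / window
        mean_size = sum(sizes[start:start + window]) / window
        points.append((mean_size, mean_age))
    return points


def age_from_size(curve, size):
    """Read an age from the curve by linear interpolation; None if out of range."""
    for (s0, a0), (s1, a1) in zip(curve, curve[1:]):
        if s0 <= size <= s1 and s1 > s0:
            return a0 + (a1 - a0) * (size - s0) / (s1 - s0)
    return None


def measurement_estimate(curves, measured_sizes):
    """Equal-weight mean of the age estimates from all indicators."""
    estimates = [age_from_size(c, s) for c, s in zip(curves, measured_sizes)]
    if any(e is None for e in estimates):
        return None
    return sum(estimates) / len(estimates)


# Hypothetical standards for one indicator: ages in months, sizes in cm.
std_ages = [24, 36, 48, 60, 72, 84, 96, 108, 120]
std_sizes = [a / 100.0 for a in std_ages]
curve = moving_average_curve(std_ages, std_sizes)

# Four indicators; the same hypothetical curve is reused for brevity.
film_sizes = [0.62, 0.66, 0.70, 0.74]
print(measurement_estimate([curve] * 4, film_sizes))
```

In this sketch the curve stops two standards short of each end of the age span, so a size beyond its range yields no estimate. This is one way of seeing the weakness of the moving-average method near the ends.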
For the Todd intermediates a more complicated method of estimating age from indicator sizes was also used: a multiple regression equation, which automatically allotted an appropriate weight to each indicator and permitted adjustment for general hand (or bone) size where required (phalanx shaft width as an adjustment for epiphysis width).
The ages estimated by measurement (moving-average and regression methods) were compared with inspectional estimates, from Todd and Greulich-Pyle atlases, on three series of roentgenograms comprising a total of more than 250 films from 190 children of ages 2 to 14½ years. The conclusions were:
1. For the group-study of children between the age when ossific centers have appeared and the age of incipient epiphyseal union the measurement estimates of skeletal age would be equivalent to inspectional estimates in comparing (a) average skeletal ages, (b) inter-child variation in skeletal age, and (c) average progress (average gain in skeletal age).
2. Because even “actual-size” reproduction in the preparation of an atlas causes a change in size, the measurement method, unless based on intermediates, does not permit reliable comparison, even of groups, with standard (atlas) children. This drawback could be removed by publishing, instead of an atlas, tables of measurement-age equivalents derived from current or future series of standard children’s roentgenograms (actual films or intermediates) with some reproductions for guidance in measurement.
3. In the assessment of an individual child’s skeletal age status or progress the differences between inspectional and measurement ages vary so greatly from film to film that the measurement method seems to be of little use; but for this purpose the inspectional method itself is very crude.
ACKNOWLEDGMENTS
The author wishes to thank Mrs. Ruth B. Mainland for the roentgenography and inspectional assessments, Mrs. Helen H. Wertheimer for the measurements on the films, Miss Carol Gertzis, Miss Ruth M. Smith and Miss Elisabeth Street for statistical analysis and valuable constructive criticism.
REFERENCES
1. White House Conference on Child Health and Protection: Growth and Development of the Child. Part II. Anatomy and Physiology. New York, Century, 1933.
2. Todd, T. W.: Atlas of Skeletal Maturation (Hand). St. Louis, Mosby, 1937.
3. Greulich, W. W., and Pyle, S. I.: Radiographic Atlas of Skeletal Development of Hand and Wrist. Stanford, California, Stanford Univ. Press, 1950.
4. McCloy, C. H.: Appraising Physical Status: Methods and Norms. Univ. Iowa Studies in Child Welfare, 15:No. 2. Iowa City, Univ. of Iowa, 1938.
5. Mainland, D.: Evaluation of the skeletal age method of estimating children’s development. I. Systematic errors in assessment of roentgenograms. PEDIATRICS, 12:114, 1953.
6. Mainland, D.: Evaluation of the skeletal age method of estimating children’s development. II. Variable errors in assessment of roentgenograms. PEDIATRICS, 13:165, 1954.
7. Macy, I. G.: Nutrition and Chemical Growth in Childhood, Vol. 2. Springfield, Thomas, 1946.
8. Snedecor, G. W.: Statistical Methods Applied to Experiments in Agriculture and Biology. Ames, Iowa, Iowa State College Press, 1956.
SUMMARIO IN INTERLINGUA
Evalutation del Etate Skeletic
Post le apparition de centros ossific, le disveloppamento del manos de pueros, vidite in roentgenogrammas, es in grande mesura un alteration del conformation e del proportiones del ossos, e isto es correlationate a un certe [...] de iste factos, con le objectivo de evitar alicunes del defectos de evalutationes de roentgenogrammas super le base del reproductiones in le atlantes standard (de Todd e de Greulich-Pyle) e del intermediarios de Todd (i.e. de reproductiones filmate del roentgenogrammas usate in le atlante), simple mesurationes linear esseva obtenite de quatro indicatores de maturitate: le epiphyse del radius, le osso capitate, le epiphyse del osso metacarpal III, e le epiphyse del phalange proximal de digito III. Pro exprimer le relation inter etate e le dimension de cata un del indicatores, curvas de mobile valores medie esseva disveloppate. Postea, pro obtener estimationes de etate pro un roentgenogramma particular, le mesuration de cata un del indicatores esseva transformate in le etate correspondente per le uso del curva appropriate. Le quatro etates assi obtenite esseva reducite a lor valor medie. Pro intermediarios, un equation de regression multiple esseva disveloppate.
Le etates estimate super le base de mesurationes esseva comparate con estimationes secundo le inspection del mesme atlantes in tres series de roentgenogrammas, un total de plus que 250 pelliculas ab 190 individuos de etates inter 2 e 14,5 annos. Le sequente conclusiones esseva formulate:
1. In le studio de gruppos de pueros ab le apparition de centros ossific usque al incipiente union epiphysee, le estimationes per mesuration esserea le equivalente de estimationes per inspection in comparar (a) valores medie del etate skeletic, (b) variationes del etate skeletic ab un individuo al altere, e (c) progresso medie, i.e. augmento medie del etate skeletic.
2. Proque le reproduction a “dimensiones natural” in le preparation del atlantes non es exacte, le methodo del mesuration, excepte si illo labora con intermediarios, non permitterea comparationes digne de confidentia, mesmo de gruppos de individuos, con le subjectos standard (in le atlantes). Sed il esserea possibile eliminar iste disavantage per publicar, in loco de un atlante, tabulas de equivalentias de mesura e etate, derivate ab roentgenogrammas standard e supplementate per un certe numero de reproductiones como guida in le manovra del mesuration.
3. In le evalutation del stato o del progresso del etate skeletic de un puero individual, le differentias inter le etate per inspection e le etate per mesuration varia si fortemente ab un pellicula al proxime que le methodo del mesuration es apparentemente de pauc valor. Sed
in su application a iste objectivo, etiam le