SPECIAL ARTICLE


AMERICAN BOARD OF PEDIATRICS

It is desirable that as many people as possible should know how much care and thought is given both to the preparation and to the subsequent analysis of the examinations of the American Board of Pediatrics. For this reason the following comments on the examination of January 1950 are published for general information.

The written examination of the American Board of Pediatrics given in January 1950 has been subjected to statistical analysis. It is proposed that all subsequent examinations be analyzed in a similar or improved way in order to learn whether modifications are accomplishing the purpose for which they were made and in the endeavor to improve the accuracy of the grading. Some of the results of this first statistical analysis may be of interest and may help in understanding how reliable the examination in its present form is.

The examination consisted of 200 false and true statements and was taken by 353 candidates. A majority of the candidates marked all of the statements as being either false or true. That is to say, they marked with confidence when they knew and guessed when they did not know. However, a fairly large number of candidates refused to commit themselves at all when they did not know. It is of interest that one of the highest grades ever earned in these examinations was achieved by a candidate also distinguished by having refused to commit himself on the largest number of statements. The method of grading is one which yields essentially the same figure whether or not the candidate elects to guess. To the number of correctly answered statements one adds half the number of unanswered statements and from the total subtracts 100; the remainder is the grade.
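
The grading rule above can be sketched in a few lines. This is only an illustration: the function names and the example candidate are hypothetical, and the claim that a blind guess is right half the time is an assumption made for the sketch, not a figure from the examination.

```python
def grade(correct: float, unanswered: float) -> float:
    """Grade as described in the text: correct + half of unanswered - 100."""
    return correct + unanswered / 2 - 100


def expected_grade_if_guessing(known_correct: int, unknown: int) -> float:
    """Expected grade when every unknown statement is guessed at random.

    Assumes each blind guess on a false-or-true statement is right with
    probability 1/2 (an assumption for illustration).
    """
    expected_correct = known_correct + unknown * 0.5
    return grade(expected_correct, unanswered=0)


# A hypothetical candidate who knows 140 of the 200 statements and is unsure of 60.
abstain = grade(correct=140, unanswered=60)
guess = expected_grade_if_guessing(known_correct=140, unknown=60)
print(abstain, guess)  # 70.0 70.0
```

Under these assumptions the candidate who abstains and the candidate who guesses have the same expected grade, although the guesser's actual grade will scatter around that figure.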

The distribution of the grades earned by the 353 candidates is shown on the accompanying chart (histogram). The form of the histogram suggests that we may be dealing with an essentially normal distribution. The suggestion is corroborated by additional data on the chart. Medians and quartiles have been determined; these statistics are quite independent of an assumption of normality of distribution. The median is a grade which divides the candidates in half in such a way that 50% have lower grades and 50% have higher grades. The median is given as 54.9. Since no candidate can earn a grade ending in the decimal .9, the meaning of the decimal deserves a word of explanation. Let us suppose that 10 candidates earned a mark of exactly 54 and that 9 must be placed in the lower one-half and 1 in the upper one-half to effect the equal division of the total number of candidates about a median. Statistically this is done by using the decimal .9. Quartiles are determined in a similar way. Since one-fourth of the grades are below the first quartile and one-fourth are higher than the third quartile, it follows that half the grades are between the first and third quartiles. Again let us stress the point that these statistics are determined by actual counting of the grades and do not depend on the type of distribution, normal or otherwise.
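
The counting procedure for the median and quartiles might be sketched as follows. The function name, the `width` parameter, and the sample grades are hypothetical; the sketch assumes, as the worked example in the text implies, that tied candidates are spread evenly over one grade point starting at the tied grade.

```python
def interpolated_quantile(grades, fraction, width=1.0):
    """Quantile found by counting, with tied grades split in proportion.

    `width` is the span credited to a tied grade (an assumption; the text's
    example implies one grade point starting at the grade itself).
    """
    ordered = sorted(grades)
    target = fraction * len(ordered)  # how many grades must lie below
    for value in sorted(set(ordered)):
        below = sum(1 for g in ordered if g < value)
        tied = ordered.count(value)
        if below + tied >= target:
            return value + (target - below) / tied * width
    raise ValueError("fraction must lie between 0 and 1")


# Hypothetical grades arranged like the text's example: 10 candidates tied
# at 54, of whom 9 must fall in the lower half of the 40 candidates.
grades = [40] * 11 + [54] * 10 + [70] * 19
print(round(interpolated_quantile(grades, 0.5), 1))  # 54.9
```

Calling the same function with `fraction=0.25` or `fraction=0.75` gives the first and third quartiles by the same counting rule, with no appeal to the shape of the distribution.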

In contrast, the fidelity with which a mean (or average) and its standard deviation describe a distribution does depend upon normality. These statistics have also been [...] from it, the probable error of the distribution, has been used in preparing the chart. This has been done because the range covered by the mean plus and minus one probable error will for normally distributed data encompass 50% of the figures. Direct comparison with the quartile figures is then possible. The essential identity of the ranges obtained by the two methods of computation speaks for the normality of the distribution of these 353 grades.
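
The comparison of the two ranges can be checked with a short sketch. It assumes that the first column of chart figures gives the mean and the mean plus and minus one probable error, and that the second column gives the quartiles and median found by counting; the variable names are illustrative only.

```python
from statistics import NormalDist

# Multiple of the standard deviation that encloses the central 50% of a
# normal distribution; this factor defines the probable error.
PE_FACTOR = NormalDist().inv_cdf(0.75)  # about 0.6745


def central_coverage(multiple: float) -> float:
    """Share of a normal distribution within +/- `multiple` SDs of the mean."""
    z = NormalDist()
    return z.cdf(multiple) - z.cdf(-multiple)


# Figures read from the chart (column assignment is an assumption).
mean_minus_pe, mean_plus_pe = 48.2, 63.1
first_quartile, third_quartile = 48.2, 63.3

range_from_pe = mean_plus_pe - mean_minus_pe           # width of mean +/- 1 P.E.
range_from_counting = third_quartile - first_quartile  # interquartile range

print(round(PE_FACTOR, 4), round(central_coverage(PE_FACTOR), 3))
print(round(range_from_pe, 1), round(range_from_counting, 1))
```

The two widths differ by about 0.2 of a grade point, which is the "essential identity" the text relies on as informal evidence of normality.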

[Chart: AMERICAN BOARD OF PEDIATRICS, EXAMINATION OF JANUARY 1950, 353 CANDIDATES. Histogram of EXAMINATION GRADES, with groupings centered from 20 to 85 in steps of 5 and a vertical scale from 0 to 80. COLUMN FIGURES REFER TO THE CENTRAL POINT OF EACH GROUPING OF GRADES.]

MEAN - 1 P.E.   48.2     FIRST QUARTILE   48.2
MEAN            55.6     MEDIAN           54.9
MEAN + 1 P.E.   63.1     THIRD QUARTILE   63.3

There are of course more elaborate and more exact methods of testing for normality of distribution. In the present case the good to be gained does not justify the labor.

So much for the manner in which these 353 candidates are distributed with respect to grades. A lowest mark of 17.5 and a highest mark of 85 provide ample range for differential selection of candidates. There remains the important question of how accurately the examination defines the position of the individual candidate against the background [...] analysis of variance. The method is one in which total variability (total variance) is expressed as the sum of the squares of the differences between each grade and the mean for all of the grades. The total variance can then be divided into the three components contributing to it. These components are: variability of ability among the 353 candidates; an actual difference in difficulty between the odd-numbered and even-numbered statements; a residual variability which measures the degree of success of the examination in fixing the grades of the individual candidate. We are interested here in this residual variability from which an error of assessment can be computed. The error calculated from the analysis of variance is that pertaining to an examination based on 100 statements. The error appropriate to the entire examination based on 200 statements is easily derived from it by dividing by the square root of two. Calculated in this way the standard error of assessment (usually called the standard error of measurement or S.E.M.) turns out to be 5.028. Similarly, the probable error of measurement or P.E.M. is 3.391.
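
The decomposition described above can be sketched as a two-way analysis of variance on candidates by halves. This is a minimal illustration, not the Board's actual computation: the function name and the four pairs of half-grades are hypothetical, and the sketch assumes each half-grade is expressed on the same scale as the full grade, so that the full grade is the average of the two halves and the text's division by the square root of two applies.

```python
from math import sqrt


def split_half_errors(odd_grades, even_grades):
    """Two-way analysis of variance on candidates x halves, without replication.

    Returns the three sums of squares and the standard and probable errors
    of measurement for the full examination.
    """
    n = len(odd_grades)
    pairs = list(zip(odd_grades, even_grades))
    grand_mean = sum(o + e for o, e in pairs) / (2 * n)

    ss_total = sum((x - grand_mean) ** 2 for pair in pairs for x in pair)
    # Between candidates: each candidate mean rests on two half-grades.
    ss_candidates = 2 * sum(((o + e) / 2 - grand_mean) ** 2 for o, e in pairs)
    # Between halves: each half mean rests on n candidates.
    odd_mean = sum(odd_grades) / n
    even_mean = sum(even_grades) / n
    ss_halves = n * ((odd_mean - grand_mean) ** 2 + (even_mean - grand_mean) ** 2)
    ss_residual = ss_total - ss_candidates - ss_halves

    residual_df = (n - 1) * (2 - 1)
    sem_half = sqrt(ss_residual / residual_df)  # error of a 100-statement grade
    sem_full = sem_half / sqrt(2)               # the text's rule for 200 statements
    pem_full = 0.6745 * sem_full                # probable error from standard error
    return ss_candidates, ss_halves, ss_residual, sem_full, pem_full


# Hypothetical half-grades for four candidates (not data from the examination).
odd = [50, 60, 70, 40]
even = [54, 58, 76, 44]
ss_cand, ss_half, ss_res, sem, pem = split_half_errors(odd, even)
print(ss_cand, ss_half, ss_res, round(sem, 3), round(pem, 3))
```

The same factor links the two figures reported in the text: 0.6745 times 5.028 is 3.391.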

The practical meaning of these statistics can be clarified in the following way: If it were possible for a single candidate to take an unlimited number of similar examinations, each based on 200 statements of identical difficulty, the average value of all of the grades would be a constant. This constant is conveniently termed the true grade. For a single examination the odds are even, i.e., 50-50, that the assigned grade will differ from the true grade by not more than 3.39, the probable error of measurement. Derived from these statistics are the facts that one time in 20 the assigned grade will differ from the true grade by more than 9.85 and that once in a hundred times the difference will be greater than 12.95.
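
The limits quoted for the one-in-20 and one-in-a-hundred cases follow from the S.E.M. if errors of measurement are taken to be normally distributed, which is an assumption of this sketch. On that assumption the even-odds limit is the probable error, 0.6745 times the S.E.M. The function name `limit` is illustrative.

```python
from statistics import NormalDist

SEM = 5.028  # standard error of measurement reported in the text


def limit(probability_within: float, sem: float = SEM) -> float:
    """Deviation from the true grade not exceeded with the given probability.

    Assumes errors of measurement are normally distributed.
    """
    z = NormalDist().inv_cdf(0.5 + probability_within / 2)
    return z * sem


for label, p in [("even odds", 0.50),
                 ("19 times in 20", 0.95),
                 ("99 times in 100", 0.99)]:
    print(f"{label}: {limit(p):.2f}")
```

The three printed limits are 3.39, 9.85 and 12.95 grade points.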

An additional and possibly useful item of information can be gleaned from these statistics. If the relative abilities of two candidates are to be compared on the basis of their written examination grades, the odds are only slightly better than even of a real difference of abilities if the grades differ by only 5. We can be fairly sure of a real difference (odds, 20 to 1) if the grades differ by 14; we can be highly sure (odds, 100 to 1) of a real difference if the grades differ by 18. It is hardly necessary to add that the differences refer only to relative abilities in scoring on this type of false and true examination. The extent to which the differences measure relative abilities in pediatrics remains as the $64 question.
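
The figures for comparing two candidates can be approximated from the S.E.M. The sketch assumes the two grades carry independent, normally distributed errors, so that the error of their difference is the S.E.M. times the square root of two; it reads the text's "odds of a real difference" loosely as the complement of the chance that two candidates of equal true grade would differ by that much. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

SEM = 5.028                    # standard error of measurement from the text
SE_DIFFERENCE = SEM * sqrt(2)  # error of a difference between two independent grades


def chance_of_larger_gap(gap: float) -> float:
    """Probability that two candidates of equal true grade differ by more than `gap`."""
    z = gap / SE_DIFFERENCE
    return 2 * (1 - NormalDist().cdf(z))


for gap in (5, 14, 18):
    print(gap, round(chance_of_larger_gap(gap), 3))
```

On these assumptions a gap of 5 arises by chance a little under half the time, a gap of 14 about one time in 20, and a gap of 18 about one time in a hundred, in line with the odds given in the text.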


SPECIAL ARTICLE: AMERICAN BOARD OF PEDIATRICS. Pediatrics 1951;8;598



The online version of this article, along with updated information and services, is located on the World Wide Web at: http://pediatrics.aappublications.org/content/8/4/598

American Academy of Pediatrics. All rights reserved. Print ISSN: 1073-0397.