Reliability can be measured in several ways, and each type of measure is estimated by a different method. The chief methods of estimating reliability are summarized in Table 1 below.
Table 1: Methods of Estimating Reliability
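Method | Type of reliability measure | Procedure
Test-retest method | Measure of stability | Give the same test twice to the same group, with a time interval between the two administrations.
Equivalent-forms method | Measure of equivalence | Give two forms of the test to the same group in close succession.
Split-half method | Measure of internal consistency | Give test once. Score two equivalent halves, say odd- and even-numbered items; correct the reliability coefficient to fit the whole test by the Spearman-Brown formula.
Kuder-Richardson methods | Measure of internal consistency | Give test once. Score the total test and apply the Kuder-Richardson formula.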
3.2.1 Test Retest Method – Measure of Stability
Estimating reliability by the test-retest method requires the same test to be administered twice to the same group of learners, with a given time interval between the two administrations. The resulting test scores are correlated, and the correlation coefficient provides a measure of stability. How long the time interval between tests should be is determined largely by the use to be made of the results. If the results of both administrations are highly stable, testees who score high on one administration will tend to score high on the other, while the remaining testees will tend to stay in the same relative positions on both administrations. Such stability would be indicated by a large correlation coefficient.
An important factor in interpreting measures of stability is the time interval between tests. A short interval, such as a day or two, inflates the consistency of the results, since the testees will remember some of their answers from the first test when they take the second. On the other hand, if the interval is long, say about a year, the results will be influenced both by the instability of the testing procedure and by actual changes in the learners over that period. Generally, the longer the time interval between test and retest, the more the results will be influenced by changes in the learners' characteristics being measured, and the smaller the reliability coefficient will be.
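The correlation step described above can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the two score lists are hypothetical, and the Pearson product-moment coefficient is assumed to be the correlation intended.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 testees)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same six testees on two administrations of one test.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 10]

stability = pearson_r(first_administration, second_administration)
print(f"Test-retest reliability (stability): {stability:.2f}")
```

Because the testees in this made-up data keep nearly the same relative positions on both administrations, the coefficient comes out close to 1.0. Shuffling the second list would lower it.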
3.2.2 Equivalent Forms Method - Measure of Equivalence
Estimating reliability by the equivalent (or parallel) forms method involves the use of two different but equivalent forms of the test. The two forms are administered to the same group of learners in close succession, and the resulting test scores are correlated. The resulting correlation coefficient provides a measure of equivalence. That is, the correlation coefficient indicates the degree to which both forms of the test are measuring the same aspects of behaviour. This method reflects the extent to which the test represents an adequate sample of the characteristics being measured, rather than the stability of the testee. It eliminates the problem of selecting a proper time interval between tests
as in the test-retest method, but it requires two equivalent forms of the test. This requirement restricts its use almost entirely to standardized testing, where it is widely used.
3.2.3 Split-Half Method – Measure of Internal Consistency
This method estimates the reliability of test scores from a single administration of a single form of a test. The test is administered to a group of testees and is then divided, for scoring purposes, into two equivalent halves, usually the odd-numbered and the even-numbered items. The two halves produce two scores for each testee which, when correlated, provide a measure of internal consistency. The coefficient indicates the degree to which equivalent results are obtained from the two halves of the test. The reliability of the full test is usually obtained by applying the Spearman-Brown formula.
That is,

\[
\text{Reliability on full test} = \frac{2 \times \text{Reliability on } \tfrac{1}{2} \text{ test}}{1 + \text{Reliability on } \tfrac{1}{2} \text{ test}}
\]
The split-half method, like the equivalent-forms method, indicates the extent to which the sample of test items is a dependable sample of the content being measured. In this case, a high correlation between scores on the two halves of a test denotes the equivalence of the two halves and, consequently, the adequacy of the sampling. Also like the equivalent-forms method, it tells nothing about changes in the individual from one time to another.
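The split-half procedure and the Spearman-Brown correction can be sketched as follows. The function names and the item-response matrix are hypothetical and serve only as illustration. Each row holds one testee's item scores (1 = correct, 0 = wrong), and the odd/even split is assumed.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(half_test_r):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * half_test_r / (1 + half_test_r)

def split_half_reliability(item_scores):
    """Return (half-test correlation, full-test reliability) for an odd/even split."""
    # Items 1, 3, 5, ... form the odd half; items 2, 4, 6, ... the even half.
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    half_r = pearson_r(odd_half, even_half)
    return half_r, spearman_brown(half_r)

# Hypothetical responses of six testees to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

half_r, full_r = split_half_reliability(responses)
print(f"Correlation between halves: {half_r:.2f}")
print(f"Spearman-Brown full-test reliability: {full_r:.2f}")
```

As a quick check on the formula itself, a half-test correlation of 0.60 steps up to 2(0.60) / (1 + 0.60) = 0.75 for the full test.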
3.2.4 Kuder-Richardson Method – Measure of Internal Consistency
This method estimates the reliability of test scores from a single administration of a single form of a test by means of formulas such as those developed by Kuder and Richardson. Like the split-half method, these formulas provide a measure of internal consistency. However, they do not require splitting the test in half for scoring purposes. Kuder-Richardson formula 20, one of the formulas for estimating internal consistency, is based on the proportion of persons passing each item and the standard deviation of the total scores. The result of the analysis using this formula is equivalent to the average of all possible split-half coefficients for the group tested. However, it is rarely used because the computation is rather cumbersome unless information is already available on the proportion of persons passing each item. Kuder-Richardson formula 21, a less accurate but simpler formula to compute, can be applied to the results of any test that has been scored on the basis of the number of correct answers. A modified version of the formula is:
\[
\text{Reliability estimate (KR21)} = \frac{k}{k-1}\left(1 - \frac{M(k-M)}{k s^{2}}\right)
\]

Where
k = the number of items in the test
M = the mean of the test scores
s = the standard deviation of the test scores.
The result of this formula approximates that of Kuder-Richardson formula 20, although in most cases it gives a smaller reliability estimate.
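The two Kuder-Richardson estimates can be compared with a short Python sketch. The KR21 function follows the formula given above. The KR20 function uses the standard form k/(k - 1) × (1 - Σpq / s²), which is not written out in this text and is supplied here as an assumption, with p the proportion of testees passing an item and q = 1 - p. The function names and the response data are hypothetical, and the variance is computed by dividing by the number of testees.

```python
def _totals(item_scores):
    """Total score (number of correct answers) for each testee."""
    return [sum(row) for row in item_scores]

def _variance(values):
    """Variance of the scores, dividing by the number of testees (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 1 (pass) or 0 (fail)."""
    n = len(item_scores)
    k = len(item_scores[0])
    s2 = _variance(_totals(item_scores))
    if k < 2 or s2 == 0:
        raise ValueError("need at least two items and varying total scores")
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in item_scores) / n  # proportion passing item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / s2)

def kr21(item_scores):
    """Kuder-Richardson formula 21: needs only k, the mean M and the variance."""
    k = len(item_scores[0])
    totals = _totals(item_scores)
    m = sum(totals) / len(totals)
    s2 = _variance(totals)
    if k < 2 or s2 == 0:
        raise ValueError("need at least two items and varying total scores")
    return (k / (k - 1)) * (1 - m * (k - m) / (k * s2))

# Hypothetical responses of six testees to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

print(f"KR20 = {kr20(responses):.2f}")
print(f"KR21 = {kr21(responses):.2f}")
```

On this made-up data the KR21 estimate is the smaller of the two, in line with the statement above. The two would coincide only if every item were equally difficult.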
This method of estimating reliability tests whether the items in the test are homogeneous. In other words, it seeks to establish whether each test item measures the same quality or characteristic as every other item. If this is established, the reliability estimate will be similar to that provided by the split-half method. On the other hand, if the test lacks homogeneity, an estimate smaller than the split-half reliability will result.
The Kuder-Richardson and split-half methods are widely used in determining reliability because they are simple to apply. Nevertheless, the following limitations restrict their value:
They are not appropriate for speed tests, for which the test-retest or equivalent-forms methods give better estimates.
They, like the equivalent-forms method, do not indicate the constancy of a testee's response from day to day. Only the test-retest procedure indicates the extent to which test results are generalizable over different periods of time.
They are adequate for teacher-made tests because these are usually