INTERPRETING THE TEST SCORES
4.1 THE PERCENTAGE CORRECT SCORE: What does a test score mean?
A test score is a piece of information, usually a number, that conveys the performance of an examinee on a test. One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being measured."
Test scores are interpreted with a norm-referenced or criterion-referenced interpretation, or occasionally both. A norm-referenced interpretation means that the score conveys meaning about the examinee with regard to their standing among other examinees. A criterion-referenced interpretation means that the score conveys information about the examinee with regard to a specific subject matter, regardless of other examinees' scores.
Types of Test Scores
There are two types of test scores: raw scores and scaled scores. A raw score is a score without any sort of adjustment or transformation, such as the simple number of questions answered correctly. A scaled score is the result of some transformation applied to the raw score.
The purpose of scaled scores is to report scores for all examinees on a consistent scale. Suppose that a test has two forms, and one is more difficult than the other. It has been determined by equating that a score of 65% on form 1 is equivalent to a score of 68% on form 2. Scores on both forms can be converted to a scale so that these two equivalent scores have the same reported scores. For example, they could both be a score of 350 on a scale of 100 to 500.
Two well-known tests in the United States that have scaled scores are the ACT and the SAT. The ACT's scale ranges from 0 to 36 and the SAT's from 200 to 800 (per section). Ostensibly, these two scales were selected to represent a mean and standard deviation of 18 and 6 (ACT), and 500 and 100 (SAT). The upper and lower bounds were selected because an interval of plus or minus three standard deviations contains more than 99% of a population. Scores outside that range are difficult to measure and return little practical value.
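The relationship between a mean, a standard deviation and the bounds of a reporting scale can be sketched in a few lines of Python. This is only an illustration under a strong assumption: that the reported score is a simple linear function of a standardized score z (the number of standard deviations above or below the mean), clipped at plus or minus three standard deviations. The function name scaled_score and its parameters are hypothetical; actual testing programs use their own equating and conversion tables.

```python
def scaled_score(z, mean, sd, limit=3.0):
    """Map a standardized score z onto a reporting scale.

    Assumption: a plain linear transformation, clipped at +/- `limit`
    standard deviations. Real testing programs use conversion tables.
    """
    z_clipped = max(-limit, min(limit, z))
    return mean + sd * z_clipped


# One standard deviation above the mean on a 500/100 scale.
sat_like = scaled_score(1.0, mean=500, sd=100)

# Bounds of an 18/6 scale: three standard deviations either side of the mean.
lower_bound = scaled_score(-3.0, mean=18, sd=6)
upper_bound = scaled_score(3.0, mean=18, sd=6)
```

With a mean of 18 and a standard deviation of 6, the clipped range works out to 0 through 36, and with 500 and 100 it works out to 200 through 800, which matches the bounds quoted above.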
Note that scaling does not affect the psychometric properties of a test; it is something that occurs after the assessment process (and equating, if present) is completed. Therefore, it is not an issue of psychometrics, per se, but an issue of interpretability.
Interpreting the Score by Criterion Referencing
The raw score is the number of points received on a test when the test has been scored according to the instructions. A raw score is not very meaningful without further information. Criterion-referenced test interpretation permits us to describe an individual's test performance without referring to the performance of other individuals. Thus we might describe a student's performance in terms of the speed and precision with which a certain task is performed. Criterion-referenced interpretation of test scores is most meaningful when the test is designed to measure a set of clearly stated learning tasks. Enough items are used for each interpretation to make dependable judgments.
Interpreting the Score by Percentages
In mathematics, a ratio expressed as a fraction of 100 is called a percentage (denoted by %). It is often useful to express scores as percentages for comparison. Consider the following example.
Grade    Class A                      Class B
         No. of Students    %         No. of Students    %
A        10                 12.50     8                  40
B        25                 31.25     6                  30
C        30                 37.50     4                  20
D        15                 18.75     2                  10
Total    80                 100       20                 100
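The percentages in the table can be reproduced with a short Python sketch. The counts are taken directly from the table; the function name percentages and the dictionary layout are illustrative choices, not part of the original text.

```python
def percentages(counts):
    """Convert a mapping of grade -> number of students into grade -> percent."""
    total = sum(counts.values())
    return {grade: 100 * n / total for grade, n in counts.items()}


# Counts copied from the table above.
class_a = {"A": 10, "B": 25, "C": 30, "D": 15}
class_b = {"A": 8, "B": 6, "C": 4, "D": 2}

pct_a = percentages(class_a)
pct_b = percentages(class_b)

print(pct_a["A"], pct_b["A"])
```

Dividing each count by its own class total is what makes the two classes comparable even though one has 80 students and the other 20.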
Ten students from class A and eight students from class B got grade A. At first glance, class A appears to do better at earning grade A, but only 12.5% of the students in class A got grade A, compared with 40% of the students in class B. The percentages make it clear that class B is far better at earning grade A than class A.
Interpreting the Score by Norm Referencing
Interpretation of scores by norm referencing involves ranking the scores and expressing a given score in relation to the other scores. Norm-referenced test interpretation tells us how an individual compares with other persons who have taken the same test. The simplest type of comparison is to rank the scores from highest to lowest and to note where an individual's score falls. The rest of the scores serve as the norm group. The given score is compared with the other scores by norm referencing. If a student's score is second from the top in a group of 20 students, it is a high score, meaning that 90% of the students (18 out of 20) scored lower than that student.
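The "second from the top in a group of 20" figure can be checked with a small Python sketch. The score list below is hypothetical (twenty distinct scores invented purely for illustration), and percent_below is an illustrative helper name.

```python
def percent_below(scores, score):
    """Percentage of scores in the group that are strictly lower than `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


# Hypothetical group of 20 distinct scores: 81, 82, ..., 100.
group = list(range(81, 101))

# The second-highest score in the group.
second_from_top = sorted(group, reverse=True)[1]

result = percent_below(group, second_from_top)
print(result)
```

Only the student's own score and the single higher score are not counted, which leaves 18 of the 20 scores below.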
Ordering and Ranking
A first step in organizing scores is to list them in order of magnitude from the largest to the smallest score. Data arranged in this way are called an ordered array. By scanning an ordered array, we can quickly determine the largest score, the smallest score and other facts about the data.
Ranked data consist of scores in a form that shows their relative position on some characteristic but does not yield a numerical value for this characteristic. The order of finish of cars in a race is an example of ranking. If we list the cars as first, second, third and so on up to the last car, we can say that they were ranked on the characteristic of overall speed. We know each car's position relative to any other car's position, but we have no precise knowledge of the speed of any car. If a high school teacher ranks Hamid 30th in a class of 100, it means that Hamid did better than 70 of his classmates but poorer than 29. Nothing has been said, however, about Hamid's general level of achievement.
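Ordering and ranking can be sketched in Python as follows. The class of 100 distinct scores is hypothetical, constructed only so that one student lands in 30th place; the helper name rank_of and the convention that rank 1 is the highest score are assumptions made for this illustration.

```python
def rank_of(scores, score):
    """Rank of `score` within `scores`, where rank 1 is the highest score."""
    return 1 + sum(1 for s in scores if s > score)


# Hypothetical class of 100 students with distinct scores 1..100.
class_scores = list(range(1, 101))

# Ordered array: largest score first, smallest last.
ordered = sorted(class_scores, reverse=True)

# The score sitting in 30th position of the ordered array.
hamid_score = ordered[29]

hamid_rank = rank_of(class_scores, hamid_score)
did_better_than = sum(1 for s in class_scores if s < hamid_score)
did_poorer_than = sum(1 for s in class_scores if s > hamid_score)
```

The rank reports only position: the same rank of 30 would result whether the scores were tightly bunched or widely spread.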
Measurement Scales
Measurement scales are of great significance in analyzing and interpreting results. The important types of measurement scales are the nominal, ordinal, interval and ratio scales.
The Nominal Scale
The lowest measurement scale is the nominal scale. In this scale, each individual is put into one of the distinct categories or classes. Each class has a name. The names are just labels. There is no order in these classes. We cannot say that one class is larger than the other class. You cannot do arithmetic operations (addition, subtraction, multiplication, division) on this scale.
Examples of the nominal scale include the categorization of the blood groups of the students of a college into A, B, AB and O groups. We cannot say that group A is better than group B. Another example is the classification of books in a college library according to subject.
The distribution of the population of Pakistan according to sex, religion, occupation, marital status, literacy, etc., also provides examples of the nominal scale.
The Ordinal Scale
When measurements are not only different from category to category but can also be ranked according to some criterion, they are said to be measured on an ordinal scale. The members of any one category are considered equal, but members of one category are considered lower than those in another category. The ordinal scale is one step higher than the nominal scale because we not only distribute the individuals into classes but also order these classes.
As an example of the ordinal scale, schools may be categorized according to their educational level into primary, middle, secondary or higher secondary. There is an order in these classes. The primary level is lower than the middle level and the middle level is lower than the secondary level. You cannot do arithmetic operations on this scale.
Individuals may be classified according to socioeconomic status as low, medium or high. Intelligence of students may be below average, average or above average. Examination results may be classified into different grades: A+, A, B, C, D, E, etc. In this measurement scale, we can say that one individual ranks higher than another, but we cannot say by how much.
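The difference between the nominal and ordinal scales described above can be illustrated with a short Python sketch. The list of blood groups is hypothetical sample data, and the names LEVELS and is_lower are illustrative. On a nominal scale the only meaningful summary is how many individuals fall in each class; on an ordinal scale the classes can additionally be compared by position, but the positions are not quantities to add or subtract.

```python
from collections import Counter

# Nominal scale: labels only. Hypothetical blood groups of seven students.
blood_groups = ["A", "B", "AB", "O", "A", "O", "O"]
group_counts = Counter(blood_groups)  # counting members of each class is valid

# Ordinal scale: the classes have an order, listed here from lowest to highest.
LEVELS = ["primary", "middle", "secondary", "higher secondary"]


def is_lower(level_a, level_b):
    """True if level_a comes before level_b in the ordering of LEVELS."""
    return LEVELS.index(level_a) < LEVELS.index(level_b)
```

For example, is_lower("primary", "middle") is true, while asking whether blood group "A" is lower than "B" has no meaningful answer because that scale carries no order.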
The Interval Scale
In this scale, it is possible not only to order measurements but also to know the distance between two measurements. We can say that the difference between the measurements 30 and 40 is equal to the difference between the measurements 40 and 50. The level of the interval scale is higher than the nominal and the ordinal scales. This is truly a quantitative scale. A unit of measurement and a zero point are required for this scale. The selected zero point is not necessarily a true zero; it does not have to indicate a total absence of the quantity being measured. Temperature in centigrade or Fahrenheit is the typical example, because zero degrees does not mean that there is no temperature. We also measure height in meters or feet, weight in kilograms or pounds, income in rupees and time in seconds; these quantities have equal intervals as well, although their zero point is a true zero. Addition and subtraction can be done on this scale. For example, you can add the income of a wife to that of her husband.
The Ratio Scale
The highest level of measurement is the ratio scale. Equality of ratios as well as equality of intervals is determined in this scale. Fundamental to the ratio scale is the true zero point. The measurement of height, weight and length makes use of the ratio scale.
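A short Python sketch can show why the true zero matters. It uses the standard conversions between centigrade and Fahrenheit and between meters and feet; the specific temperatures and heights are arbitrary values chosen for illustration. On an interval scale, equal differences stay equal after a change of units but ratios do not; on a ratio scale, ratios survive the change of units as well.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion from degrees centigrade to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def metres_to_feet(m):
    """Standard conversion: one foot is 0.3048 metres."""
    return m / 0.3048


# Interval scale: equal differences are preserved under a change of units.
diff_low = celsius_to_fahrenheit(40) - celsius_to_fahrenheit(30)
diff_high = celsius_to_fahrenheit(50) - celsius_to_fahrenheit(40)

# ...but ratios are not: 40 is twice 20 in centigrade, not in Fahrenheit.
ratio_c = 40 / 20
ratio_f = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)

# Ratio scale: a height twice as large stays twice as large in any unit.
ratio_m = 2.0 / 1.0
ratio_ft = metres_to_feet(2.0) / metres_to_feet(1.0)
```

Because the zero of the centigrade and Fahrenheit scales is arbitrary, the statement "40 degrees is twice as hot as 20 degrees" depends on the unit chosen, whereas "2 meters is twice as tall as 1 meter" does not.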
Frequency Distribution
Data in the form in which they were originally collected are called raw data or primary data. No statistical technique has yet been applied to them. To make the raw data easier to understand, we arrange them into groups or classes. Data arranged in this way are called grouped data or a frequency distribution.
General rules for the construction of a frequency distribution:
1. Determine the range. The range is the difference between the highest and lowest scores.
2. Decide the appropriate number of class intervals. There is no hard and fast formula for deciding the number of class intervals. The number of class intervals is usually taken between 5 and 20, depending on the amount of data.
3. Determine the approximate length of the class interval by dividing the range by the number of class intervals.
4. Determine the limits of the class intervals, placing the smallest scores at the bottom of the column and the largest scores at the top.
5. Determine the number of scores falling in each class interval. This is done by using a tally or score sheet.
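The rules above can be sketched as a small Python function. This is one possible implementation, not a prescribed method. It makes several assumptions that should be stated plainly: the scores are whole numbers, the first class starts at the lowest score, and the class length is rounded up to the next whole number so that the highest score always falls inside the last class. The function name frequency_distribution is hypothetical, the sample consists of the first ten marks from the example that follows, and the choice of 5 class intervals for it is arbitrary.

```python
def frequency_distribution(scores, num_classes):
    """Group whole-number scores into `num_classes` equal class intervals.

    Returns a list of ((lower_limit, upper_limit), frequency) pairs,
    ordered from the lowest class to the highest.
    """
    lowest, highest = min(scores), max(scores)
    score_range = highest - lowest              # rule 1: the range
    length = score_range // num_classes + 1     # rule 3, rounded up (assumption)

    table = []
    for k in range(num_classes):                # rule 4: class limits
        lower = lowest + k * length
        upper = lower + length - 1
        # rule 5: count the scores falling in this class
        count = sum(1 for s in scores if lower <= s <= upper)
        table.append(((lower, upper), count))
    return table


# First ten marks of the worked example; 5 classes is an arbitrary choice.
sample = [57, 86, 69, 62, 75, 73, 80, 78, 87, 83]
table = frequency_distribution(sample, num_classes=5)

# Print with the largest scores at the top, as rule 4 suggests.
for (lower, upper), count in reversed(table):
    print(f"{lower}-{upper}: {count}")
```

The function returns the classes in ascending order and the printing loop reverses them, so the tabulated column runs from the largest scores at the top to the smallest at the bottom.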
Example:
The marks obtained by 120 students of the first year class in the subject of Education are given below. Construct a frequency distribution.
57 86 69 62 75 73 80 78 87 83
77 35 70 68 84 73 81 78 61 72
59 98 95 63 76 73 88 60 52 83
86 45 70 53 85 74 62 78 89 84
60 79 91 64 84 85 81 79 90 78
83 50 71 65 76 58 71 79 51 61
61 89 81 74 76 74 82 91 71 76
80 52 71 66 77 65 44 79 95 74
79 63 83 87 77 75 83 48 70 85
61 70 72 67 61 83 75 79 97 75
66 54 81 68 78 75 83 61 33 76
62 55 72 76 78 75 99 80 83 86
The following steps are followed to make a frequency distribution.
Step 1: Range = maximum score - minimum score = 99 - 33 = 66.
Step 2: The approximate number of class intervals to be taken is 7.
Step 3: The length of the class intervals, usually denoted by i, is
i =