
Validity & Reliability


Academic year: 2021


(1)

VALIDITY & RELIABILITY

(2)

QUALITIES OF MEASUREMENT DEVICES

• Validity: Does it measure what it is supposed to measure?

• Reliability: How consistent is the measurement?

• Practicality: Is it easy to construct, administer, score and interpret?

• Backwash: What is the impact of the test on the teaching/learning process?

(3)

QUALITIES OF MEASUREMENT DEVICES

• In Psychology, we judge the quality and goodness of measuring devices by two psychometric criteria: reliability & validity

• If a predictor is not both reliable and

(4)

VALIDITY

Validity is a standard for evaluating tests that refers to the accuracy or appropriateness of the inferences drawn from test scores.

It refers to whether or not a test measures what it intends to measure.

On a test with high validity, the items will be closely linked to the test’s intended focus.

If a test has poor validity, it does not measure the job-related content and competencies it ought to.

(5)

VALIDITY

There are several ways to estimate the validity of a test, including content validity, construct validity, criterion-related validity (concurrent & predictive) and face validity.

(6)

VALIDITY

• “Content”: related to objectives and their sampling.

• “Construct”: referring to the theory underlying the target.

• “Criterion”: related to concrete criteria in the real world. It can be concurrent or predictive.

• “Concurrent”: correlating highly with another measure already validated.

• “Predictive”: capable of anticipating some later measure.

(7)

1. CONTENT VALIDITY

Content validity refers to the degree to which subject matter experts agree that the items in a test are a representative sample of the domain of knowledge the test purports to measure.

The test should evaluate only the content related to the field of study, in a manner that is sufficiently representative, relevant, and comprehensible.

(8)

2. CONSTRUCT VALIDITY

It refers to the degree to which a test is an accurate and faithful measure of the construct (concept, idea, notion) it purports to measure.

Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure.

For example, a test of intelligence nowadays must include measures of multiple intelligences, rather than just logical-mathematical and linguistic ability measures.

(9)

3. CRITERION-RELATED VALIDITY

The degree to which a test forecasts or is statistically related to a criterion.

Also referred to as instrumental validity, it requires that the criteria be clearly defined.

There are two major kinds of criterion-related validity: concurrent and predictive.

(10)

4. CONCURRENT VALIDITY

Concurrent validity is used to diagnose the existing status of some criterion.

It is a statistical method using correlation, rather than a logical method.

In measuring concurrent criterion-related validity, we are concerned with how well a predictor relates to a criterion measured at the same time, that is, concurrently.

(11)

5. PREDICTIVE VALIDITY

This is another statistical approach to validity that estimates the relationship of test scores to an examinee's future performance.

Predictive validity considers the question, "How well does the test predict examinees' future status as masters or non-masters?"

(12)

5. PREDICTIVE VALIDITY

For this type of validity, the correlation that is computed is based on the test results and the examinee’s later performance.

This type of validity is especially useful for test purposes such as selection or admissions.
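The correlation described above can be sketched in a few lines of Python. This is only an illustration: the slides do not say which coefficient is used, so the Pearson coefficient is an assumption here, and the function name `pearson_r` and all scores below are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: admission test scores and the same examinees'
# grade averages one year later (the criterion).
admission_scores = [52, 61, 70, 75, 88]
later_grades = [2.1, 2.6, 2.9, 3.4, 3.8]

validity_coefficient = pearson_r(admission_scores, later_grades)
print(f"predictive validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 would suggest the test ranks examinees much as their later performance does; a coefficient near 0 would suggest it has little value for selection. For concurrent validity the same calculation applies, with the criterion measured at the same time as the test.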

(13)

6. FACE VALIDITY

Face validity is determined by whether the items in the test appear appropriate for the intended use of the test to the individuals who take the test.

Unlike content validity, face validity is not investigated through formal procedures. Instead, anyone who looks over the test, including examinees, may develop an informal opinion as to whether or not the test is measuring what it is supposed to measure.

(14)

6. FACE VALIDITY

Face validity alone is insufficient for establishing that the test is measuring what it claims to measure.

(15)

RELIABILITY

Reliability is the standard for evaluating tests that refers to the consistency, stability or equivalence of test scores.

It is the extent to which an experiment, test, or any measuring procedure shows the same result on repeated trials.

• “Equivalency”: related to the co-occurrence of two items

(16)

1. EQUIVALENT-FORM RELIABILITY

A type of reliability that reveals the equivalence of test scores between two versions or forms of the tests.

Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty.

Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association.

(17)

1. EQUIVALENT-FORM RELIABILITY

For example, a researcher studying university English students happened to notice that when some students were studying for finals, they got sick. Intrigued by this, the researcher attempted to observe how often, or to what degree, these two behaviors co-occurred throughout the academic year. The researcher used the results of the observations to assess the correlation between “studying throughout the academic year” and “getting sick”. The researcher concluded there was poor equivalency reliability between the two actions. In other words, studying was not a reliable predictor of getting sick.

(18)

2. STABILITY RELIABILITY

Stability reliability (sometimes called test-retest reliability) reveals the stability of test scores upon repeated applications of the test.

To determine stability, a measure or test is repeated on the same subjects at a future date. Results are compared and correlated with the initial test to give a measure of stability.

(19)

2. STABILITY RELIABILITY

This method of evaluating reliability is appropriate only if the phenomenon that the test measures is known to be stable over the interval between assessments.

(20)

3. INTERNAL CONSISTENCY

Internal consistency is a type of reliability that reveals the homogeneity of the items comprising a test.

It is the extent to which tests or procedures assess the same characteristic, skill or quality.

This type of reliability often helps researchers interpret data and predict the value of scores and the limits of the relationship among variables.

(21)

3. INTERNAL CONSISTENCY

For example, analyzing the internal reliability of the items on a vocabulary quiz will reveal the extent to which the quiz focuses on the examinee’s knowledge of words.
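One common way to put a number on item homogeneity is Cronbach's alpha. The slides do not name a statistic, so alpha is an assumption here, and the function name `cronbach_alpha` and the quiz data are hypothetical. A minimal sketch:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score table.

    item_scores: one row per examinee, each row a list with one score per item.
    """
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in item_scores):
        raise ValueError("every examinee needs a score for every item")
    # Variance of each item across examinees
    item_variances = [
        pvariance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Variance of each examinee's total score
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical vocabulary quiz: 5 examinees, 4 items scored 0 (wrong) or 1 (right).
quiz = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(quiz)
print(f"Cronbach's alpha: {alpha:.2f}")
```

For this toy data alpha comes out at about 0.8. Values closer to 1 indicate that the items tend to rise and fall together, which is what would be expected if they all tap the same knowledge of words.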

(22)

4. INTER-RATER RELIABILITY

Inter-rater reliability reveals the degree of agreement among the assessments of two or more raters.

Inter-rater reliability assesses the consistency of how a measuring system is implemented.

It is dependent upon the ability of two or more individuals to be consistent.

Training, education and monitoring skills can enhance inter-rater reliability.

(23)

4. INTER-RATER RELIABILITY

For example, suppose two or more teachers use a rating scale to rate students’ oral responses in an interview (1 being most negative, 5 being most positive). If one teacher gives a "1" to a student response while another teacher gives a "5", the inter-rater reliability is obviously low.
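Agreement between two raters can be quantified as simple percent agreement, or as Cohen's kappa, which corrects for the agreement expected by chance. Kappa is not mentioned in the slides; it is used here as one common choice, and the function names and ratings are hypothetical. A minimal sketch for two raters:

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of responses on which both raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length rating lists")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 8 oral responses on the 1-5 scale.
teacher_a = [1, 2, 3, 4, 5, 3, 4, 2]
teacher_b = [1, 2, 3, 4, 5, 3, 5, 2]

agreement = percent_agreement(teacher_a, teacher_b)
kappa = cohen_kappa(teacher_a, teacher_b)
print(f"percent agreement: {agreement:.3f}, kappa: {kappa:.3f}")
```

Note that kappa treats the ratings as unordered categories, so a "4" versus "5" disagreement counts the same as "1" versus "5". For an ordered scale like this one, a weighted kappa or a correlation may be a better fit.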

(24)

5. INTRA-RATER RELIABILITY

Intra-rater reliability is a type of reliability assessment in which the same assessment is completed by the same rater on two or more occasions.

These different ratings are then compared, generally by means of correlation.

Since the same individual completes both assessments, the rater's subsequent ratings may be contaminated by knowledge of the earlier ratings.

(25)

SOURCES OF ERROR

Examinee (is a human being)

Examiner (is a human being)

Examination (is designed by and for

(26)

RELATIONSHIP BETWEEN VALIDITY & RELIABILITY

Validity and reliability are closely related.

A test cannot be considered valid unless the measurements resulting from it are reliable.

However, results from a test can be reliable and not necessarily valid.

(27)
