QM STEM Ed
Quantitative Methods in STEM
Education Research
Topic 2: Quantitative research
design, data collection, and
measurement
Judy Sheard
Faculty of Information Technology, Monash University, Australia
Overview of topic 2
Variance
Sampling
Data collection – questionnaires
Variables
Measurement
Research design for quantitative
research
Quantitative research tends to be structured
and prescriptive in nature.
The outcomes are expressed as numbers.
These numbers must be interpreted by the
researcher to produce valid and usable
results – these results are used to answer research questions or problems.
Controlling variance
In a research study, there will be differences in measurements taken for variables –
individuals are not all the same!
An important part of quantitative research is explaining or controlling for variance
(difference) … we will meet a formal definition
later.
Controlling variance means creating conditions where the researcher can get a clear view of the variables of interest while eliminating, limiting, or explaining the influences of other variables.
Controlling variance
Methods for controlling variance:
Randomisation – spread the effect of the
variance.
Building conditions or factors that may vary
into the design as independent variables –
i.e. variables of interest that will be studied.
Holding conditions or factors constant.
Statistical adjustment – adjusting results to
Controlling variance – an example
A study of the effect of problem-based learning (PBL) on the performance of introductory programming students. Sixty students are to be taught using PBL or a traditional method. The effect will be measured by performance on an end of semester exam.
Possible sources of variance:
Prior programming experience of the students
Teacher
Degree program (undergraduate or postgraduate)
Controlling variance – an example
Variable | Method of controlling variance
Prior programming experience | Randomisation – randomly assign students to a group
Teacher | Use as an independent variable
Degree program (undergrad or postgrad) | Reduce to a constant – include only undergraduate students
Ability level of students | Statistical control – e.g. use university entry score
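The randomisation step in the PBL example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `randomly_assign`, the student identifiers, the group labels and the seed are all hypothetical, and equal group sizes are an assumption.

```python
import random

def randomly_assign(student_ids, groups=("PBL", "traditional"), seed=None):
    """Shuffle the students, then deal them into the groups in turn."""
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, student in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(student)
    return assignment

# Sixty hypothetical student identifiers, S01 to S60.
students = [f"S{n:02d}" for n in range(1, 61)]
allocation = randomly_assign(students, seed=2024)
print({group: len(members) for group, members in allocation.items()})
```

Because every student is equally likely to land in either group, differences in prior programming experience should be spread across both groups rather than concentrated in one.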
Characteristics of good
research design
Freedom from bias – e.g. from biased
assignment of individuals.
Freedom from confounding – two or more
variables are confounded if their effects cannot be separated.
Control of extraneous variables – variables
not of primary interest to the study.
Statistical precision for testing hypothesis –
Sampling
A census is the collection of information from
all members of a population.
In experimental, quasi-experimental and
survey research a census is typically not feasible, desirable or possible.
Instead, the researcher uses a subset or
Sampling: some concepts
Sample – the set of individuals to be
measured.
Population – the set of individuals to which
the conclusions apply.
Sampling Frame – the set of individuals from
which the sample is derived. May be a subset of the population.
Deriving a sample
Decide upon the unit of analysis
Define the population
Delineate the sampling frame
Generate the initial sample
An example
A researcher wants to find out about engineering students’ use of educational resources.
Unit of analysis – individuals
Population – all engineering students
Sampling frame – engineering students at Uppsala
Initial sample – a group of 100 engineering students
Deriving the sample
If we wish to generalise back to the population then the initial sample should be:
Unbiased
Biased samples
A sample is biased when some individuals are more
likely to be sampled than others – also called a
non-random sample.
For example, a lecturer wants to determine how much time his students spend preparing for exams.
He collects data from students in one lecture towards the end of semester.
Biased samples
A physics department conducts course
evaluations via email questionnaires sent to the students.
Generating an unbiased sample
A sample is unbiased when all individuals are
equally likely to be sampled – also called a random
sample.
Random sampling involves probability sampling –
each member of the target population has a known, non-zero chance of being selected. This allows:
unbiased estimates of population characteristics; and
accurate assessment of sampling error – variation due to
random fluctuation.
Probability sampling is recommended if accurate
Generating an unbiased sample
Methods used to generate an unbiased sample:
Simple random sampling
Systematic sampling – (e.g. every 5th person on a
list)
Cluster sampling – used with a population or
sampling frame of natural groupings: Randomly select a subset of the groups.
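The three methods listed above can be sketched with Python's standard `random` module. The sampling frame of 100 students, the five classes of 20, and the function names are hypothetical. Starting the systematic sample at a random position is an assumption added here so that every individual has a chance of selection.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every individual in the frame is equally likely to be chosen."""
    return rng.sample(frame, n)

def systematic_sample(frame, step, rng):
    """Take every step-th individual on the list, from a random start."""
    start = rng.randrange(step)
    return frame[start::step]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole groups and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(7)
frame = [f"student_{i}" for i in range(100)]
# Natural groupings: five classes of 20 students each.
classes = {f"class_{c}": frame[c * 20:(c + 1) * 20] for c in range(5)}

srs = simple_random_sample(frame, 10, rng)
every_fifth = systematic_sample(frame, 5, rng)
two_classes = cluster_sample(classes, 2, rng)
print(len(srs), len(every_fifth), len(two_classes))
```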
Representative samples
Biased samples are often very unrepresentative. For
example, in the exam preparation study:
Population | Sample
33% poor attenders | 5% poor attenders
67% regular attenders | 95% regular attenders
Unbiased samples are often sufficiently representative. However, unbiased samples are not always sufficiently
representative.
Stratified sampling
Stratified sampling can overcome the
problem of non-representativeness.
Stratified sampling process:
Divide population into appropriate groups called
strata.
Draw a random sample from each group.
Therefore, in the previous example we would
draw 33% of the sample from the poor attender population (may be difficult) and 67% from the regular attender population.
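A minimal sketch of the stratified process, assuming proportional allocation (each stratum contributes in proportion to its share of the population). The population of 300 students, the sample size of 60 and the function name are hypothetical. Rounding can make the stratum sizes sum to slightly more or less than the requested total in other cases.

```python
import random

def stratified_sample(strata, total_n, rng):
    """Draw a simple random sample from each stratum, sized by its population share."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

rng = random.Random(11)
# Hypothetical population: 33% poor attenders, 67% regular attenders.
strata = {
    "poor attenders": [f"P{i}" for i in range(99)],
    "regular attenders": [f"R{i}" for i in range(201)],
}
sample = stratified_sample(strata, 60, rng)
print({name: len(members) for name, members in sample.items()})
```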
Other sampling techniques
There are other forms of sampling that will not enable us to generalise back to the
population – purposeful sampling. For
example, sampling for extreme cases, maximum variation, special cases, etc.
The final sample
The final sample may differ from the initial
sample – people may refuse to participate.
Although the initial sample may be
representative of the target population – this may not be the case with the final sample.
Need to ensure that the final sample is
representative by comparing the final sample with the initial sample or the target
Sample size
Determining sample size can be complex. The researcher needs to consider:
Cost, time, and resources.
Access to, availability and willingness of subjects.
Precision of statistical analysis.
Variability within the target population.
The chosen sampling scheme.
Some “rules of thumb”
Statistical analysis of samples smaller than 10 is
not recommended for many tests.
Samples of 30 or more are recommended for
many tests.
When samples are divided for analyses then
the “rules of thumb” apply to the sub-samples.
In regression, factor analysis and other
multivariate research, a minimum sample
Unit of analysis
The unit of analysis may be individuals, pairs,
classes, year level, discipline of study, institution …
The data collected pertains to the unit of
analysis, e.g. if the unit of analysis is
individuals, each datum pertains to a person.
In educational research, we often study individuals.
Unit of analysis (cont.)
“… the majority of studies of educational effects – whether classroom experiments, or evaluations of programs or surveys – have collected and analyzed data in ways that conceal more than they reveal. The established methods have generated false conclusions in many studies.” (Cronbach, 1976)
Different aggregations and levels of analysis may
produce quite different results - masking of effects may occur.
Need to consider multilevel and within- and
between- group analyses.
Need to ensure sufficient sample size for the unit of
Data collection methods
Many data collection methods are used in quantitative research. For example:
Questionnaires
Observations
Structured interviews
Validated instruments
Assessment performance
Log files
Questionnaires
Many studies in educational research involve
administering a questionnaire.
In designing a questionnaire the researcher
must consider many things including:
information required
profile of the respondents
form of the data
data entry
Questionnaire design
Determine the information needed.
Select the types of questions – consider the planned
analysis.
Develop the questions.
Determine the sequence of the questions.
Test the questions for understanding – and revise.
Prepare a data summary.
Develop the completed questionnaire.
Pilot the questionnaire.
(Phillips, 1991)
Basic question types
Binary response
Multiple choice
Checklist
Ranking scales
Why pilot your questionnaire?
Please indicate the total time in hours you spent in preparation for the exam.
□ 0-10
□ 11-20
□ 21-30
□ more than 30
Results
Why pilot your questionnaire?
Please indicate the total time in hours you spent in preparation for the exam.
□ 0-2
□ 3-5
□ 6-10
□ more than 10
Results
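One plausible reading of the two versions of this question is that a pilot shows whether the response categories fit the answers people actually give. The sketch below tallies a set of invented pilot responses under both category schemes. The hours in `pilot_hours` are hypothetical and are not the results from the slides, and `bin_hours` is an illustrative helper.

```python
def bin_hours(hours, upper_bounds, labels):
    """Count responses per category; labels has one more entry than upper_bounds."""
    counts = {label: 0 for label in labels}
    for h in hours:
        for bound, label in zip(upper_bounds, labels):
            if h <= bound:
                counts[label] += 1
                break
        else:
            # No bound matched, so the response falls in the open-ended last category.
            counts[labels[-1]] += 1
    return counts

# Hypothetical pilot responses (hours of exam preparation).
pilot_hours = [1, 2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 12]

wide = bin_hours(pilot_hours, [10, 20, 30],
                 ["0-10", "11-20", "21-30", "more than 30"])
narrow = bin_hours(pilot_hours, [2, 5, 10],
                   ["0-2", "3-5", "6-10", "more than 10"])
print(wide)
print(narrow)
```

With these invented responses, almost everyone falls into the first of the wide categories, so that version of the question would discriminate poorly between respondents. The narrower categories spread the same responses more evenly.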
Some tips …
Make it easy for the respondents to complete the questionnaire.
Offer “Don’t know” and “Not applicable” options where appropriate.
Don’t collect data you don’t need.
Consider question sequencing.
Consider anonymity for sensitive topics.
Avoid checklists – respondents may not consider all the options.
Avoid double-barrelled questions:
Q. My students enjoyed the lectures and found them useful? True or False?
Avoid emotive, distractive, offensive language:
Q. How do you feel about the University’s policing of staff stationery use?
No comment
It’s OK, I suppose
It irritates me
I hate it
Variables
A variable is any attribute or property that
differs between people or varies across time.
In research we often examine the relationship
Types of variables
Independent variables – assumed to produce an effect
on, or be related to, a phenomenon of interest. In
experimental research they are manipulated by the researcher. May be used for classification.
Dependent variables – are measured but not
manipulated or controlled.
An example: A study of cheating behaviour of students in different disciplines.
independent variables – discipline of study, age,
gender
dependent variables – frequency of cheating,
Other types of variables
Moderator variable – a special type of independent
variable, not of primary interest, that affects the relationship of the independent variable to the dependent variable.
Intervening variable – a hypothetical entity which is
inferred from the effects of the independent variable on the dependent variable.
Control variable – independent variable that is held
Types of variables
A study of how the teaching environment influences students’ learning behaviour and achievement.
Independent variables – teaching resources, teaching method
Control variables – gender, age
Intervening variables – teaching style, learning style
Dependent variables – use of resources, course achievement
Moderator variables
Data characteristics
Two broad distinctions:
Discrete – data has specific values, e.g.
number of students in a class, grade achieved, gender
Continuous – measurement from a
continuous interval, e.g. age, time on task
The type of data largely determines how the
Measurement
Measurement – the process of assigning numerals according to
rules. The numerals are assigned to events, objects, responses to items, observed behaviours, etc. (Wiersma, 2005)
Many ways to measure - need to determine what is being measured and how it is to be measured.
Educational research encompasses a variety of:
possible variables – e.g. exam results, age, learning style, course
satisfaction, resource usage
measurement instruments – tests, questionnaires, inventories,
Types of measurement scales
Nominal (or categorical) – measures without order;
allows classification or grouping, e.g. course, gender.
Ordinal – measures with order; allows ranking, e.g.
grade, attitude towards course.
Interval – measures with order and equal intervals
on a scale, e.g. test score, IQ.
Ratio – measures with order, equal intervals on a
scale and a true zero point, e.g. age, length.
Note that in SPSS no distinction is made between interval and ratio.
Levels of measurement
Scale | Ordered | Equal interval | True zero
Nominal | | |
Ordinal | √ | |
Interval | √ | √ |
Ratio | √ | √ | √
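The levels of measurement can also be written down as a small lookup, which makes the cumulative nature of the properties explicit: each level adds one property to the level before it. The names `SCALE_PROPERTIES` and `classify_scale` are illustrative only.

```python
# (ordered, equal interval, true zero) for each measurement scale.
SCALE_PROPERTIES = {
    "nominal": (False, False, False),
    "ordinal": (True, False, False),
    "interval": (True, True, False),
    "ratio": (True, True, True),
}

def classify_scale(ordered, equal_interval, true_zero):
    """Return the scale that matches the given properties, or None if no scale does."""
    wanted = (ordered, equal_interval, true_zero)
    for name, properties in SCALE_PROPERTIES.items():
        if properties == wanted:
            return name
    return None

print(classify_scale(True, True, False))   # a test score: ordered, equal intervals, no true zero
print(classify_scale(True, True, True))    # age: ordered, equal intervals, true zero
```

Combinations that skip a level, such as equal intervals without order, match no scale and return `None`.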
What about Likert scales?
strongly agree strongly disagree
What type of measurement scale is this?
… and these Likert scales?
strongly agree 1 2 3 4 5 strongly disagree
What type of measurement scale is this?
Variables in educational research
Educational research covers a broad spectrum of phenomena. In educational research we are dealing with people. Our research may involve measurement of:
Student achievement
Demographics
Behaviours
Attitudes
Measuring attitudes and behaviour
Sometimes the concepts we are measuring are
abstract.
In order to measure a concept or phenomenon under
study we need an operational definition.
Standard tests and inventories are available for
measuring many of these. E.g. MBTI, MSLQ
Or, we may need to develop our own instrument.
Need to consider reliability and validity of the
Measurement considerations
Minimise error – consider reliability and
validity.
Low reactivity – measurements should not
affect people’s attitudes, beliefs, etc.
Powerful – measurements should be
sensitive enough to detect effects if they are present.
Errors relating to measurement
A measure may contain a systematic and a random component:
Actual score = Systematic part + Random part
The measured score may vary markedly from the actual score:
• The systematic part may contain errors, e.g. it may measure creativity rather than IQ.
• The random part may be pronounced.
Errors relating to measurement
When the systematic part comprises
negligible error, the measure is valid – that is,
the measure reflects the aspect that it is intended to reflect.
When the random part is negligible, the
measure is reliable – that is, the measure is stable under different circumstances and at different times.
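The distinction between validity and reliability can be illustrated with a small simulation. It assumes an additive model in which each score is a systematic part (the intended attribute plus any constant bias) plus a random part drawn from a normal distribution. The attribute value, bias, noise levels and the function name `simulate_measure` are all hypothetical.

```python
import random
import statistics

def simulate_measure(attribute, bias, noise_sd, n, rng):
    """Generate n scores: systematic part (attribute + bias) plus random noise."""
    systematic_part = attribute + bias
    return [systematic_part + rng.gauss(0, noise_sd) for _ in range(n)]

rng = random.Random(42)
attribute = 100  # the value the measure is intended to reflect

valid_and_reliable = simulate_measure(attribute, bias=0, noise_sd=1, n=5000, rng=rng)
reliable_not_valid = simulate_measure(attribute, bias=10, noise_sd=1, n=5000, rng=rng)
valid_not_reliable = simulate_measure(attribute, bias=0, noise_sd=15, n=5000, rng=rng)

for label, scores in [("valid and reliable", valid_and_reliable),
                      ("reliable, not valid", reliable_not_valid),
                      ("valid, not reliable", valid_not_reliable)]:
    print(label, round(statistics.mean(scores), 1), round(statistics.stdev(scores), 1))
```

A biased measure can be very stable (small spread) while consistently missing the intended value. A noisy measure can be centred on the intended value while individual scores vary widely.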
Reliability
Factors that may reduce reliability:
Subject’s state (e.g. boredom, motivation)
Subject’s ability (e.g. memory, understanding)
Environmental conditions (e.g. noise, comfort)
Researcher’s competence (e.g. recording errors)
Changes in measurement apparatus or the person
Estimating reliability
Various procedures to estimate reliability. These are based on associations between scores.
Parallel forms – two or more equivalent forms of a test are
administered to the same individuals.
Test-retest – same test administered on two or more occasions
to the same individuals.
Split-half – one administration of a test. Test is divided into 2
halves with items that match on content and difficulty.
Inter-rater reliability – two or more researchers complete the
same task.
These produce a reliability coefficient that can take values from 0 to 1.0 inclusive.
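As a sketch of how an association between scores becomes a reliability estimate, the example below computes a Pearson correlation between two administrations of the same test (test-retest). Using Pearson's r is an assumption, since the slides do not name a specific coefficient, and the scores are invented for illustration. In general r can be negative, although reliability estimates are normally expected to fall between 0 and 1.0.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(ss_x * ss_y)

# Hypothetical scores for six students on two occasions.
first_occasion = [12, 15, 9, 20, 17, 11]
second_occasion = [13, 14, 10, 19, 18, 10]

test_retest = pearson_r(first_occasion, second_occasion)
print(round(test_retest, 2))
```

The same function could be applied to scores from parallel forms, to the two halves of a split test, or to the ratings of two researchers.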
Estimating reliability – Cronbach’s alpha
Cronbach’s alpha is a measure of the
intercorrelation of items of a test – coefficient alpha.
A high alpha (> 0.7) suggests various items
are correlated with each other.
Items that correlate at or near 1.0 are
considered unidimensional and may be combined in a scale.
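Coefficient alpha can be computed directly from item scores. The sketch below uses the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The formula is not stated on the slides, and the response data and the function name `cronbach_alpha` are hypothetical.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one row per respondent, each row holding k item scores."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))                      # one tuple per item
    item_variances = [statistics.variance(column) for column in columns]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of five people to three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

In this invented data set the three items rise and fall together across respondents, so alpha comes out well above the 0.7 guideline.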
Validity
Measurement validity is inadequate when:
The measure is under-represented – the measure
does not capture all aspects of the relevant behaviour or attribute.
The measure is over-represented – the measure
captures aspects that were not meant to be captured.
Different forms of validity – face, content, criterion and
construct. Another view is that validity is a unitary
Establishing validity
Face validity – Does the test seem to capture the
behaviour or attribute of interest? (logical and subjective)
Content validity – Ensure the measure does not
seem to under- or over-estimate the attribute of interest. (logical analysis of items)
Criterion validity – Does the measure correlate with
other attributes recorded at the same time
(concurrent) or other characteristics recorded later (predictive)? (empirical)
Construct validity – Do the items associated with the
measure relate to the intended construct or concept? (logical or empirical)