Statistics, Research, & SPSS: The Basics
SPSS (Statistical Package for the Social Sciences) is a software program that makes the calculation and presentation of statistics relatively easy. It is an incredibly expensive piece of software (http://www.spss.com/stores/1/Software_HigherEducation_C91.cfm), so please do not take access to it for granted: McDaniel College pays a large sum of money to make SPSS available to students. The biggest problem with SPSS is that it is too easy to use, which will tempt you to try statistical tests that are not appropriate for the data you have collected or for the Research Questions and Hypotheses you are proposing.
While we will go over everything you need to know in class, there are many resources freely available online for mastering SPSS. YOU are responsible for
mastering SPSS, and YOU need to practice, find alternative information sources, and fill in any gaps in your knowledge/skill sets regarding use of SPSS and statistics. A simple search for “SPSS tutorials” on Google will yield a host of useful resources.
The following information packet provides you with everything you need to know
about SPSS in order to be successful in this course as well as in Senior Seminar. This
information packet includes information on entering data, outputting data to MS
Word, creating variables, cleaning your data, performing descriptive statistics,
performing inferential statistics, and developing composite scales.
Table of Contents
Statistics, Research, & SPSS: The Basics
Entering Data/Creating Variables
Cleaning Your Data
Descriptive Statistics
Inferential Statistics
Independent Samples T-Test
ANOVA
Correlations
Linear Regression
Creating Scales (Factor Analysis/Reliabilities)
Validity and Reliability
Internal Validity
External Validity
Ecological Validity
Population Validity
Construct Validity
Intentional Validity
Content Validity
Face Validity
Observation Validity
Criterion Validity
Concurrent Validity
Predictive Validity
Convergent Validity
Discriminant Validity
Factors Jeopardizing Validity
Reliability
Entering Data/Creating Variables
In SPSS, there are two views: DATA VIEW and VARIABLE VIEW. DATA VIEW is used for typing in data, and VARIABLE VIEW is used for creating variables. The key to typing in data is to type in responses across the page in DATA VIEW. So, when you are looking at an individual’s responses to your survey, for example, you need to type those responses (or their proper codes) moving across the page (the first row = the first respondent’s answers, the second row = the second respondent’s answers, and so on).
In order to type in the data, the data has to be coded (put into numbers). For example, suppose you have a categorical variable called GENDER, and on your survey you have a question such as: 1. What is your gender? Male ___ Female ___. Then, if someone selects Male, you can code their response as a “0”, and if someone selects Female, you can code their response as a “1”. If you used a Likert Scale to represent a numerical variable such as WILLINGNESS TO COMMUNICATE (1 = Strongly Disagree, 2 = Disagree, 3 = No Opinion, 4 = Agree, and 5 = Strongly Agree), then coding is easy since numbers are already attached to responses. The challenge in this case is to make sure all items are coded in the same direction. Compare the following 2 statements:
1. I like to talk whenever I have the chance.
1 2 3 4 5
2. I don’t like to talk even if I have the chance.
1 2 3 4 5
A “1” on item 1 does not mean the same thing as a “1” on item 2. In fact, the responses for the 2 items are opposites. So we would need to reverse code one of the items so that 1 becomes 5, 2 becomes 4, 3 stays 3, 4 becomes 2, and 5 becomes 1. Generally, it is standard practice to put positive values on the right side and negative values on the left:
Negative (-): No, Disagree, Hate          Positive (+): Yes, Agree, Love
1 2 3 4 5 6 7
1 2 3 4 5
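The coding and reverse-coding rules described above can be sketched in a few lines of Python. This is only an illustration and not part of SPSS: the names GENDER_CODES and reverse_code and the sample respondent are made up for the example. The idea is that a reversed score on a scale running from low to high equals (low + high) minus the original score.

```python
# Hypothetical coding scheme for the GENDER example in the text.
GENDER_CODES = {"Male": 0, "Female": 1}


def reverse_code(score, low=1, high=5):
    """Flip a scale score so the lowest value becomes the highest and vice versa."""
    if not low <= score <= high:
        raise ValueError(f"{score} is outside the {low}-{high} scale")
    return (low + high) - score


# One made-up respondent: item 1 is worded positively, item 2 negatively.
respondent = {"gender": "Female", "item1": 4, "item2": 2}

# One row of data, typed "across the page" as it would be in DATA VIEW.
coded_row = [
    GENDER_CODES[respondent["gender"]],
    respondent["item1"],
    reverse_code(respondent["item2"]),  # now points the same way as item 1
]
print(coded_row)  # [1, 4, 4]
```

The same function works for a 7-point scale by passing high=7, in which case a 7 becomes a 1.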
In VARIABLE VIEW, one has the opportunity to do a variety of things. First, one can give variables names, but with a few restrictions. The variable name must start with a letter, cannot have blank spaces, and can be no more than 8 characters long (SPSS versions 12 and later allow up to 64). Another feature that is commonly used is VALUES. “VALUES” allows you to attach word labels to numeric values. For example:
Value: 1
Value Label: Freshman
Add
1.00 = “Freshman”
You must press Add after each value label. Another key function is MISSING.
MISSING allows you to assign a numeric value to missing data. Commonly, missing data is coded as 99. For example, we know we have values as follows:
1 = Strongly Disagree
2 = Disagree
3 = No Opinion
4 = Agree
5 = Strongly Agree
These are the only 5 options; however, there is a “6” as a response for one respondent. Either 1) the data entry was wrong or 2) the respondent made a mistake. Either way, the data is missing. So we can enter “6” as missing data using the MISSING feature.
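The logic behind MISSING can be illustrated with a short Python sketch. It is an assumption-laden illustration, not how SPSS works internally: the names VALID_RESPONSES, MISSING_CODE, and clean_response and the sample data are invented, and the sketch treats every out-of-range value as missing rather than declaring “6” specifically.

```python
# The five legitimate Likert responses from the text.
VALID_RESPONSES = {1, 2, 3, 4, 5}
# The conventional missing-data code mentioned in the text.
MISSING_CODE = 99


def clean_response(value):
    """Return the response if it is valid; otherwise return None to mark it missing."""
    if value == MISSING_CODE or value not in VALID_RESPONSES:
        return None
    return value


raw = [4, 5, 6, 99, 2]  # made-up responses; the 6 is an impossible value
cleaned = [clean_response(v) for v in raw]
print(cleaned)  # [4, 5, None, None, 2]

# Statistics are then computed only on the non-missing values.
valid = [v for v in cleaned if v is not None]
average = sum(valid) / len(valid)
print(average)
```

Leaving the 6 and the 99 in the data would inflate the average, which is why missing values need to be flagged before any statistics are run.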
Finally, MEASURE allows us to specify the type of variable a variable is. A variable can be 1) nominal (categorical), 2) ordinal, 3) interval, or 4) ratio. Some statistics can only be performed on categorical variables; others can only be performed on ratio variables. So, using MEASURE, we can specify what type of variable the variable is. (In SPSS itself, MEASURE offers three choices: Nominal, Ordinal, and Scale. Scale covers both interval and ratio variables.)
Nominal Variables: are categories and not numbers. For example, GENDER is a nominal variable consisting of 2 categories, MALE or FEMALE. (mode)
Ordinal Variables: the numbers assigned to objects are in a rank order; first, second, third… An example of an ordinal level variable would be SOCIAL CLASS. We assume there is some difference between HIGH CLASS and UPPER MIDDLE CLASS with HIGH CLASS being more than UPPER MIDDLE CLASS, but the difference is not exact, and the differences between HIGH CLASS and UPPER MIDDLE CLASS and the differences between UPPER MIDDLE CLASS and MIDDLE CLASS may not be the same amount even though they should be. (median)
Interval Variables: have equal intervals between values. For example, temperature is measured using an interval scale. The difference between 1 and 2 degrees is the same as the difference between 4 and 5 degrees. Likert Scales and Semantic Differential Scales are interval level measures (though they are treated as ratio level). (mean)
Ratio Variables: have all the features of the other variables plus they have a true zero. While there will never be a time when there is no temperature, it is possible (unfortunately) to have no money. Income is thus a ratio level variable. (variance)
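Reading the parenthetical labels above (mode, median, mean, variance) as the statistic that becomes meaningful at each level of measurement, which is an interpretation rather than something the packet states outright, a brief Python sketch using the standard statistics module shows one statistic per level. All of the data below are made up for illustration.

```python
import statistics

# Made-up data, one list per level of measurement.
gender = [0, 1, 1, 0, 1]              # nominal: 0 = Male, 1 = Female (labels only)
social_class = [1, 2, 2, 3, 5]        # ordinal: rank order, unequal gaps
agreement = [2, 4, 4, 5, 3]           # Likert item, treated as interval
income = [0, 1200, 2500, 2500, 4000]  # ratio: a true zero is possible

most_common_gender = statistics.mode(gender)     # most frequent category
middle_class_rank = statistics.median(social_class)  # middle rank
average_agreement = statistics.mean(agreement)   # arithmetic average
income_variance = statistics.variance(income)    # sample variance

print(most_common_gender)   # 1
print(middle_class_rank)    # 2
print(average_agreement)    # 3.6
print(income_variance)
```

Computing a mean for gender would run without error, but the result would not describe anything real, which is the same trap the packet warns about with SPSS.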
Level
are names
have an inherent order from more to less or higher to lower
are numbers with equal intervals between them
are numbers that have a theoretical zero point