Chapter 1 Introduction 1
SAS: The Complete Research Tool 1 Objectives 2
A Note About Syntax and Examples 2 Syntax 2
Examples 3 Organization 4
Chapter by Chapter 4 What This Book Is Not 5
Chapter 2 SAS Programming Concepts 7
Fundamentals 7 Datasets 7 Units of Work 8 Syntax 9 Defaults 10
Running the Program 10 An Annotated Example 12 Recap 15
Chapter 3 Preparing the Data for Analysis 17
Understanding the Input Data 18 Missing Data 19
Naming the Dataset: The DATA Statement 19
Locating the Data: The INFILE, CARDS, and DATALINES Statements 20 Describing the Data Layout: The INPUT Statement 21
List Input 21 Column Input 22 Formatted Input 23
INPUT Statement Formats 23 Mixing Input Styles 24 Which Style to Use? 25 Common Problems 25
Permanent Versus Temporary SAS Datasets 26 What's in the Dataset? 26
Linking to the System: The LIBNAME Statement 27
"Two-Level" Dataset Names 28 Recap 28
Syntax Summary 28 Content Review 29
Chapter 4 Introduction to DATA Step Programming 31
Documenting Your Work: Using Comments 32 Reading SAS Datasets: SET 33
Syntax 33 Examples 34
Performing Calculations: Assignment Statements 35 Syntax and Usage Notes 35
Examples 36
Building Logical Expressions 37 Comparison Operators 37 Logical Operators 38
Examples 39
Conditional Execution: IF-THEN-ELSE 39 Forms of IF-THEN-ELSE 40 Examples 40
A Note About Recoding 41 Selecting Observations: OUTPUT 42 Increasing Variable Description: LABEL 43 Controlling Display of the Data: FORMAT 43 Recap 45
Syntax Summary 45 Content Review 46
Chapter 5 Combining Datasets 47
Methods 48
Concatenation 48 Interleaving 48 Matched Merge 48 Terminology 48
Dataset Options 50 Concatenation 50
Examples 51 Interleaving 53
Example 53 Matched Merge 54
Syntax and Usage 54 Examples 56 Recap 59
Syntax Summary 59 Content Review 59
Chapter 6 Introduction to Procedures 61
Statements Used with Most Procedures 62
Clarifying Content: TITLE and FOOTNOTE 62 Processing Data in Groups: BY 63
Controlling Appearance: FORMAT and LABEL Revisited 63 Selecting Observations: WHERE 64
Identifying Groups of Variables: Variable Lists 65 Displaying the Dataset: PRINT 66
Example 66
Identifying the Dataset 67
Identifying and Ordering Variables 68 Requesting Counts and Sums 69 Aesthetics 69
Example 70
Rearranging the Dataset: SORT 72 Specifying the Datasets 72
Specifying Observation Order: The BY Statement 72 Sorting Options 73
Getting Information about the Dataset: CONTENTS 74 Identifying the Datasets 74
Controlling the Amount of Output 74 Example 74
Recap 75
Syntax Summary 75 Content Review 76
What's in a PROC Name? 77 A Warning 78 The Roadmap 78
Univariate Statistics 78
Simple Bivariate Tests and Measures of Association 79 Comparing Means and Analysis of Variance 79
Estimating Prediction Equations with Regression 80 Specialized Regression Models 81
Models for More than One Dependent Variable: Multivariate Statistics 81 Measurement and Scaling Techniques 82
Grouping and Predicting Group Membership 83 Navigating the Documentation Maze 83
Documentation 84 Recap 85
Chapter 8 Statistics for Single Variables 87
Examining Distributions of Categorical Variables 87 Creating Frequency Distributions 87
Visualizing Frequency Distributions with Bar Charts 91 Examining Distributions of Continuous Variables 98
PROC MEANS: Basic Summary Statistics for Continuous Variables 102 Basic MEANS Syntax 102
Display and Analytical Options 102 Specifying Analysis Variables 103
PROC UNIVARIATE: Obtaining Detailed Statistics, Tables, and Graphs 104 Basic UNIVARIATE Syntax 104
Display and Analytical Options 107 Recap 109
Syntax Summary 109 Content Review 110
Chapter 9 Statistics for Relationships with Continuous
Dependent Variables 111 ANOVA Statistics for Categorical Independent Variables 112 Visualizing Group Distributions with PROC CHART 112 Visualizing Distributions with PROC UNIVARIATE 115
Testing Differences in Means for Two Independent Groups: PROC TTEST 116 Testing Differences in Means for Several Independent Groups: PROC GLM 118 Specifying Post Hoc Comparisons 121
Using PROC GLM for Two-Way to N-Way ANOVAs 124 Testing for Differences of Related Means 128
Regression: Statistics for Continuous Independent and Dependent Variables 12 Visualizing Bivariate Relationships Using PROC PLOT 133
Estimating Regression Models with PROC REG 136
Analysis of Covariance: Statistics for Both Categorical and Continuous Independent Variables 143 Using PROC GLM for ANCOVA 143
Interpreting PROC GLM ANCOVA Output 147 Using PROC REG for ANCOVA 148
Recap 152
Syntax Summary 152 Content Review 153 References 154
Chapter 10 Statistics for Relationships with Categorical
Dependent Variables 155 Examining Bivariate Relationships 156
Producing a Crosstabulation with Tests and Measures of Association 156 Entering Previously Crosstabulated Data 162
Testing for Differences Between Groups on a Rank-Ordered Dependent Variable 164 Logistic Regression: Estimating a Prediction Equation for a Dichotomous Dependent Variable 167
PROC LOGISTIC: Estimating Logistic Regression Models with Continuous Independent Variables 168 PROC CATMOD: Estimating Logistic Regression Models with Categorical Independent Variables 174 Recap 179
Syntax Summary 179 Content Review 180 References 180
Chapter 11 More About SAS Programming 183
Options 184
Identifying System Settings 184 Frequently Used Options 185 DATA Step Programming Revisited 187
Grouping Related Statements: DO and END 187 Streamlining Calculations: Functions 187 Working with Dates 189
Background 190
Assigning Date Values 190 Manipulating Date Values 191 Displaying Date Values 192 Recap 193
Syntax Summary 193 Content Review 194
Chapter 12 PROCs That Create Datasets 195
Aggregating by Groups: The MEANS Procedure 195 The PROC MEANS Statement 196
Instructions for Aggregation: The CLASS Statement 197 Specifying Analysis Variables: The VAR Statement 197 Naming and Requesting Statistics: The OUTPUT Statement 198 Examples 200
Standardization of Data: The STANDARD Procedure 202 The PROC STANDARD Statement 202
Specifying Analysis Variables: The VAR Statement 203 Examples 203
Rank-Ordering Data: The RANK Procedure 204 Background 204
The PROC RANK Statement 204
Specifying Input and Output Variables: The VAR and RANK Statements 20\
Examples 206 Recap 207
Syntax Summary 207 Content Review 208
Chapter 13 Recoding and Labeling with PROC FORMAT 209
Concepts 210 Syntax 210
Using Formats for Display 212 Analytical Use of Formats 214 Using Formats in DATA Steps 217
Syntax Summary 218 Content Review 218
Chapter 14 Working with Character Data 221
Principles 221 Length 221 Magnitude 221
Specifying Length: The LENGTH Statement 222 Character-Handling Operators 223
The Concatenation Operator 223 The Colon (:) Comparison Modifier 224 Character-Handling Functions 224
Obtaining Variable Information: LENGTH 225 Extracting Part of a String: SUBSTR 225
Locating Strings Within a String: INDEX Variants and VERIFY 226 Altering the Appearance of a String 227
Recap 229
Syntax Summary 229 Content Review 229
Chapter 15 Putting It All Together 231 Predicting Opinion About Taxes 231 Predicting Employee Turnover 242 Treating Fear of Snakes 255 Recap 259
Closing Comment 259 References 260
Appendix A Using the Display Manager 261
Display Manager Background 261 Syntax 265
A Sample Display Manager Session 266 Entering the Program 266 Running the Program 267 Moving Through the Output 268 Revising the Program 268 Saving Your Work 269 Leaving SAS 270
Retrieving Your Work 270
Customizing Your Environment with the KEYS Window 271 Syntax Summary 272
Command Line Commands 272 Line Commands 274
Appendix B Resources 277
Person-to-Person 277 Help Desks 277
"Power Users" 278 User Groups 278 Courses 278 Electronic Resources 279
Online Help 279 SAS/Assist 279
SAS Sample Library 279
The SAS-L List Server 280 World Wide Web 280 Hard Copy 281
Local Documentation 281 SAS Institute Publications 281 Books by Users 281
Appendix C Common Problems (and Solutions) 283
Division by Zero 283 Unbalanced Quotes 284 DOing Without ENDing 285 Uninitialized Variables 286 Subtle Omissions of Periods 287 Display Manager "Stalls" 287 Data Type Conversion Messages 288
New Character Variables Appear Truncated 289 Subsetting into 0 Observations 289
Complaints About Correct Syntax 290 Unbalanced Parentheses 291
"Dataset Not Found" Messages 292 Every Calculated Value Is Missing 292 Conflicting Data Types 293
All Calculations Are Missing 294