Data management and SAS Programming Language
EPID576D
Time: Tuesdays and Thursdays, 11:00 am – 12:15 pm
Location: Drachman Hall A319
Instructor: Angelika Gruessner, MS, PhD 626-3118 (office)
Drachman Hall A224
Office Hours: Monday – Thursday, 11:00 am – 12:15 pm
Course Description:
This course will introduce students to the fundamentals of data management using the SAS programming language and management system. Emphasis will be placed on the management of large data sets and data manipulation, including reading, processing, recoding, and reformatting of data. In addition, this course will be an introduction to the SAS system. The approach will be to teach by example, with an emphasis on hands-on learning. The course will cover the specific topics listed in the course objectives below.
Each participant should afterwards be able to take and pass the ‘Global Basic SAS Programming’ certification.
Course Prerequisites:
Basic computer literacy, CPH 576A & CPH 573A or permission of the instructor
Course Objectives:
On completion of the course, students will be able to:
1. Build or import a SAS data set from raw data, spreadsheets, or database systems
2. Reformat / recode data
3. Read, combine and restructure multiple data sets
4. Use SAS functions
5. Use SAS dates
6. Use SAS arrays
7. Perform data cleaning procedures
8. Conduct ‘exploratory data analysis (EDA)’ using SAS
9. Use the ODS system and SAS graphics
10. Integrate SAS programming with data management and statistical analyses
11. Use SQL for data import and manipulation
12. Develop SAS macros
Course Notes:
Students will be given a copy of the PowerPoint presentation and handouts for each day. Students may also be given additional examples of specific SAS programs, which will be discussed in class. Students are responsible for ALL material distributed during the semester.
Competencies:
• Ability to identify appropriate statistical tools to address specific scientific questions
• Demonstrate excellent presentation skills and the ability to explain statistical concepts and findings to a general scientific audience
• Demonstrate advanced working skills in application of computer systems and appropriate statistical software
• Demonstrate understanding of methods of data analysis and data monitoring
It is essential to have a USB key to save all work.
Recommended Texts/Readings:
o Delwiche, L.D. and Slaughter, S.J. The Little SAS Book: A Primer, Fourth Edition. The SAS Institute, 2009.
o Der, G. and Everitt, B.S. A Handbook of Statistical Analyses Using SAS. CRC Press, 2009.
o Cody, R. and Pass, R. SAS Programming by Example. The SAS Institute, 2007.
o Bailer, A.J. Statistical Programming in SAS. The SAS Institute, 2010.
Course Requirements:
Student Expectations: The only way to learn programming is to practice; in fact, most programmers learn by example. This is therefore a hands-on class in which students are expected to complete ALL tasks in a timely manner. Interaction among the students and with the instructor is STRONGLY encouraged on all homework assignments. However, NO consultation with anyone is permitted for the exams or projects. Students are expected to come to class well prepared.
Attendance at all classes is recommended since the course work is cumulative. Assignments are not expected to be completed during class hours. Assignments must be handed in on time!
Classroom Environment: The class schedule may be altered to accommodate students’ incoming level of programming experience. The class will consist of approximately 1-2 hours of lecture on the given topic, based on the examples in the text or in handouts provided by the instructor. At each class, students will also be asked to present homework problems and to discuss their respective solutions. Everyone is encouraged to ask questions - every question is important. The remainder of each period will consist of programming problems, using knowledge gained from the lecture and the help of the instructor and fellow classmates.
ITEM | BRIEF DESCRIPTION
SAS Homework | Students will be required to complete all assigned problems and to submit the results (SAS code) by the assigned date. Students are expected to submit successful runs, unless they indicate and explain their errors/problems in their program(s). CONSULTATION with the instructor, other students, etc. is STRONGLY encouraged.
SAS Data Management Final Project | Students are expected to complete these independently. Specific questions about these items can be raised during class time.
Grading/Student Evaluation:
GRADING SCALE: A = 93 - 100%, B = 85 - 92%, C = 70 - 84%, F = below 70%
Course Schedule: Dates of classes, topics, assignments, readings, and examinations - see attached
Academic Integrity: Students are expected to abide by the University of Arizona Code of Academic Integrity found at http://w3.arizona.edu/~studpubs/policies/cacaint.htm.
Classroom Behavior: Students are expected to be familiar with the UA Policy on Disruptive Behavior in an Instructional Setting found at http://hr2.hr.arizona.edu/dos/pol_disrupt.htm and the Policy on Threatening Behavior by Students found at http://hr2.hr.arizona.edu/dos/pol_threat.htm.
COPH Grievance Policy: College of Public Health students who believe they have been subjected to unfair treatment in the administration of academic policies may seek resolution of their complaints through the College of Public Health Grievance Process found at
BLOCK | TOPIC | SUBTOPICS | BOOK | HOMEWORK
1 | Introduction to SAS | Structure of SAS; DATA and PROC steps; SAS Windows Environment; Variable definitions; Syntax rules | |
2 | Introduction to simple data input | SAS: Program Data Vector, Syntax Rules; INFILE statement; INPUT statement; LIBNAME statements | | Problem 1
3 | Data manipulation, conditional processing and missing values | Creating new variables and transformations; IF/THEN/ELSE | | Problem 2
4 | Simple SAS procedures | MEANS, FREQ, UNIVARIATE, TTEST, CORR, REG | |
5 | SAS ARRAY computations | ARRAY | | Problem 3
6 | SAS FUNCTIONS | Mathematical functions; Character functions; Pseudo random number generator | | Problem 4
7 | SAS dates | Definition and date and time functions | | Problem 5
8 | Restructuring data sets | LAG and RETAIN | | Problem 6
9 | SAS labels and formats | PROC FORMAT, format library | |
10 | SAS output and SAS graphics | SAS ODS (Output Delivery System); RTF, HTML, PDF formats | |
11 | Data cleaning techniques | | | Problem 7
12 | Sorting, merging and update of data sets | MERGE, UPDATE | | Problem 8
14 | SAS engines and interfaces | EXCEL, ACCESS, SPSS, STATA | | Problem 9
15 | Database management and manipulation, table lookup | PROC SQL | | Problem 10
16 | SAS macro variables and macro programming | Improving programming and introduction to simulations | | Problem 11
17 | Introduction to matrix language | SAS IML | |