
Data management and SAS Programming Language EPID576D


Academic year: 2021



Time: Tuesdays and Thursdays, 11:00 am – 12:15 pm

Location: Drachman Hall A319

Instructor: Angelika Gruessner, MS, PhD 626-3118 (office)

Drachman Hall A224

[email protected]

Office Hours: Monday – Thursday, 11:00 am – 12:15 pm

Course Description:

This course introduces students to the fundamentals of data management using the SAS programming language and the SAS system. Emphasis is placed on managing large data sets and on data manipulation, including reading, processing, recoding, and reformatting data. The approach is to teach by example, with an emphasis on hands-on learning. The course covers the specific topics listed in the course objectives below.

After completing the course, each participant should be prepared to take and pass the ‘Global Basic SAS Programming’ certification.

Course Prerequisites:

Basic computer literacy; CPH 576A and CPH 573A, or permission of the instructor

Course Objectives:

On completion of the course, students will be able to:

1. Build or import a SAS data set from raw data, spreadsheets, or database systems

2. Reformat / recode data

3. Read, combine, and restructure multiple data sets

4. Use SAS functions

5. Use SAS dates

6. Use SAS arrays

7. Perform data cleaning procedures

8. Conduct ‘exploratory data analysis (EDA)’ using SAS

9. Use the ODS system and SAS graphics

10. Integrate SAS programming with data management and statistical analyses

11. Use SQL for data import and manipulation

12. Develop SAS macros


Course Notes:

Students will be given a copy of the PowerPoint presentation and handouts for each class day. Students may also be given additional examples of specific SAS programs, which will be discussed in class. Students are responsible for ALL material distributed during the semester.

Competencies:

• Identify appropriate statistical tools to address specific scientific questions

• Demonstrate excellent presentation skills and the ability to explain statistical concepts and findings to a general scientific audience

• Demonstrate advanced working skills in application of computer systems and appropriate statistical software

• Demonstrate understanding of methods of data analysis and data monitoring

It is essential to have a USB key to save all work.

Recommended Texts/Readings:

• Delwiche, L. D. and Slaughter, S. J., The Little SAS Book: A Primer, Fourth Edition, The SAS Institute, 2009

• Der, G. and Everitt, B. S., A Handbook of Statistical Analyses Using SAS, CRC Press, 2009

• Cody, R. and Pass, R., SAS Programming by Example, The SAS Institute, 2007

• Bailer, A. J., Statistical Programming in SAS, The SAS Institute, 2010


Course Requirements:

Student Expectations: The only way to learn programming is to practice; in fact, most programmers learn by example. This is therefore a hands-on class in which students are expected to complete ALL tasks in a timely manner. Interaction among the students and with the instructor is STRONGLY encouraged on all homework assignments. However, NO consultation with anyone is permitted for the exams or projects. Students are expected to come to class well prepared.

Attendance at all classes is recommended since the course work is cumulative. It is not expected that assignments be completed during class hours. Assignments must be handed in on time!

Classroom Environment: The class schedule may be altered to accommodate students’ incoming level of programming experience. The class will consist of approximately 1-2 hours of lecture on the given topic, based on the examples in the text or in handouts provided by the instructor. At each class, students will also be asked to present homework problems and to discuss their respective solutions. Everyone is encouraged to ask questions - every question is important. The remainder of each period will consist of programming problems, using knowledge gained from the lecture and the help of the instructor and fellow classmates.

ITEM | BRIEF DESCRIPTION

SAS Homework | Students will be required to complete all assigned problems and to submit the results (SAS code) by the assigned date. Students are expected to submit successful runs, unless they indicate and explain the errors/problems in their program(s). CONSULTATION with the instructor, other students, etc., is STRONGLY encouraged.

SAS Data Management Final Project | Students are expected to complete these independently. Specific questions about these items can be raised during class time.


Grading/Student Evaluation:

GRADING SCALE:

A = 93 - 100%

B = 85 - 92%

C = 70 - 84%

F = BELOW 70%

Course Schedule: Dates of classes, topics, assignments, readings, and examinations - see attached

Academic Integrity: Students are expected to abide by the University of Arizona Code of Academic Integrity found at http://w3.arizona.edu/~studpubs/policies/cacaint.htm.

Classroom Behavior: Students are expected to be familiar with the UA Policy on Disruptive Behavior in an Instructional Setting found at http://hr2.hr.arizona.edu/dos/pol_disrupt.htm and the Policy on Threatening Behavior by Students found at http://hr2.hr.arizona.edu/dos/pol_threat.htm.

COPH Grievance Policy: College of Public Health students who believe they have been subjected to unfair treatment in the administration of academic policies may seek resolution of their complaints through the College of Public Health Grievance Process found at


BLOCK | TOPIC | SUBTOPICS | BOOK | HOMEWORK

1 | Introduction to SAS | Structure of SAS; DATA and PROC steps; SAS Windows Environment; Variable definitions; Syntax rules | |

2 | Introduction to simple data input | SAS: Program Data Vector, Syntax Rules; INFILE statement; INPUT statement; LIBNAME statements | | Problem 1

3 | Data manipulation, conditional processing and missing values | Creating new variables and transformations; IF/THEN/ELSE | | Problem 2

4 | Simple SAS procedures | MEANS, FREQ, UNIVARIATE, TTEST, CORR, REG | |

5 | SAS ARRAY computations | ARRAY | | Problem 3

6 | SAS FUNCTIONS | Mathematical functions; Character functions; Pseudo random number generator | | Problem 4

7 | SAS dates | Definition and date and time functions | | Problem 5

8 | Restructuring data sets | LAG and RETAIN | | Problem 6

9 | SAS labels and formats | PROC FORMAT, format library | |

10 | SAS output and SAS graphics | SAS ODS (Output Delivery System); RTF, HTML, PDF formats | |

11 | Data cleaning techniques | | | Problem 7

12 | Sorting, merging and update of data sets | MERGE, UPDATE | | Problem 8

14 | SAS engines and interfaces | EXCEL, ACCESS, SPSS, STATA | | Problem 9

15 | Database management and manipulation, table lookup | PROC SQL | | Problem 10

16 | SAS macro variables and macro programming | Improving programming and introduction to simulations | | Problem 11

17 | Introduction to matrix language | SAS IML | |
