Biology BSC 6932
Applied Regression for Scientists Fall 2014
Instructor: Dr. Leah Johnson
Department: Integrative Biology
Office: SCA
Phone:
E-mail: [email protected]
Office Hours: by appointment. I am very happy to meet with students but would rather do it at a time that is convenient for both of us. Drop me an email or catch me before or after class if you would like to set up a time.
Class Meeting Times and Locations: M, W from 2-3:30 p.m. in SCA 222
Course Status: permission of instructor required, intended for graduate students
Suggested Textbooks:
A Modern Approach to Regression with R, by Simon Sheather, 1st Edition, Springer ISBN-13: 978-0387096070
A Primer of Ecological Statistics, by Gotelli and Ellison, 2nd Edition. Sinauer Associates, Inc. ISBN-10: 0-87893-269-0
Course Description and Goals:
This course is a practical introduction to data analysis and modeling aimed at scientists. In the course you’ll learn how to think about modeling data, how to design experiments to collect data, and how to choose appropriate statistical methods and models for those data. Further, you will obtain a good foundation to learn and use sophisticated modern statistical methods in the future. The emphasis is on using regression – the core of modern statistics – to analyze a variety of real-world problems, via data. As you learn statistical topics you’ll also be introduced to a powerful, open source statistical programming language called R (http://cran.us.r-project.org/). Learning by actually doing analyses will be emphasized. It is assumed that students will have taken a basic undergraduate stats course (so you would know what a p-value is, for instance) and should also know some calculus – this will not be a mathematically advanced class and won’t cover a lot of statistical theory. More advanced statistical techniques will be covered in a follow-on course.
Textbooks:
There is no required text for this course. Everything you need will be in the course notes and supporting material. However, if you want a book (which I suggest), there are two recommended texts. First is A Modern Approach to Regression with R, by Simon Sheather. This is a useful reference and has more detail than we will see in class. It is a bit advanced, and covers additional topics that will serve you well in the future. Further, the book website (http://www.stat.tamu.edu/~sheather/book) has R code to accompany the text as well as translations of most procedures into both STATA and SAS. For those wanting something a bit more approachable, and that includes experimental design, philosophy of science and statistics, and data management, I recommend A Primer of Ecological Statistics, by Gotelli and Ellison, with its examples drawn from ecology. It also covers much of the material, focusing on a conceptual understanding with less mathematical detail.
Format:
This course will be taught as a “flipped” or inverted course. That is, the “lecture” component (i.e., the introduction to material) will occur outside of class. Material equivalent to less than two chapters of reading will be assigned to be completed before each class session. Online quizzes to help in learning the material will also be assigned. During class meetings some of this material will be briefly reviewed in short lectures. However, most of the in-class time will be allocated to examples, exercises, discussions of content, and other activities. In addition to learning about statistics and experimental design, you will learn about techniques for data and code management, and you will become familiar with the open source R statistical programming language. Students are encouraged to ask questions during the class meetings and to participate fully in all discussions and in-class exercises and assignments.
Attendance and Participation:
Attendance is expected but is not mandatory. You are all adults and can make your own decisions about the benefits of attendance. However, it is my opinion that attendance will improve your learning of the material. Further, a portion of your course grade is based on participation (see below). Any work that is due on the date of an absence must be handed in prior to the date of the absence, or at a later date with permission of the instructor. Students are responsible for all material covered in class regardless of attendance. Participation is expected from each student during class. The quality and frequency of contribution to class discussions, debates, exercises and assignments will determine your participation grade at the end of the semester.
Course Topics and Timing:
Below is the tentative schedule of topics for each week.
Aug 25: Intro; visualization, data management and curation and Intro to R
Sept 1: Labor Day holiday; Univariate distributions, likelihoods, parameter inference
Sept 8: Correlation and ANOVA; The simple linear regression (SLR) model
Sept 15: More SLR; Inference and estimation for SLR
Sept 22: Regression diagnostics and data transformations
Sept 29: Multiple linear regression (MLR)
Oct 6: More on MLR, dummy variables and interactions
Oct 13: Review; Midterm
Oct 20: Model selection and data mining
Oct 27: Time series models and autoregression
Nov 3: Binary (logistic) regression and classification
Nov 10: Generalized Linear Models (GLMs)
Nov 17: The Bootstrap; Monte Carlo Simulation
Nov 24: Putting it all together: designing experiments and the art of scientific modeling
Dec 1: Project 2 focus time; Final Review
Dec 6 - 12: Final Exam
Acknowledgements:
I would like to thank Dr. Robert Gramacy from the University of Chicago and Dr. Jason Rohr from USF who provided source material used in the lectures of this course.
Intellectual Property:
Electronic recording of lectures is not permitted in this course, and students are not permitted to take notes or record lectures by any means for the purpose of sale.
Academic Dishonesty:
The University and Division do not tolerate academic dishonesty, and punishment will be imposed for academic dishonesty of any kind (see the Undergraduate Catalogue for University guidelines on punishment).
“Cheating is academic dishonesty. Cheating is defined by the University as (1) unauthorized granting or receiving of aid during the prescribed period of a course-graded exercise (students may not consult written materials such as notes or books, may not look at the paper of another student, nor consult orally with any other student taking the same test); (2) a student’s asking another person to take an examination for or in place of him/her; (3) taking an examination for or in place of another student; (4) stealing visual concepts, such as drawings, sketches, diagrams, musical programs and scores, graphs, maps, etc., and presenting them as one’s own; (5) stealing, borrowing, buying, or disseminating tests, answer keys, other examination materials, research papers, creative papers, speeches, other graded assignments, etc., except as officially authorized; (6) stealing or copying of computer programs and presenting them as one’s own (including the use of another student’s program, as obtained from the magnetic media or interactive terminals or from cards, print-out paper, etc.). Engaging in plagiarism is academic dishonesty, even though a student may plagiarize without any intent to be dishonest. Plagiarism is defined by the University as literary theft, consisting of the unattributed quotation of the exact words of a published text, or the unattributed borrowing of original ideas by paraphrase from a published text. An informative discussion of plagiarism may be found at: http://www.tarleton.edu/~mkerr/Avoid Plagiarism.htm (also see the current Undergraduate Catalogue).”
“Punishment for academic dishonesty: Division guidelines for punishment are based on University guidelines (see the current Undergraduate Catalogue). For cheating, punishment is based on the seriousness of the offense. (1) For observation of or exchanging information with other students during the course of a classroom examination, the students who receive or give such information shall receive an “F” with a numerical value of zero on the test, and the “F” shall be used to determine the final course grade. It is the option of the instructor to fail the student in the course and assign an “F” or “FF” (the latter indicating academic dishonesty) grade for the course. (2) For the use of another student, a stand-in, to take an examination for an enrolled student, the enrolled student shall receive an “F” or “FF” in the course. (3) For stealing, borrowing, or buying of research papers, creative works, examinations and other test materials, or other graded assignments, or the dissemination of such materials, or the manipulation of recorded grades in a grade book or other class records, the student shall receive an “F” or “FF” in the course. For plagiarism, the student shall receive an “F” with a numerical value of zero on the item submitted, and the “F” shall be used to determine the final course grade. It is the option of the instructor to fail the student in the course and assign an “F” or “FF” grade for the course. For the use of a cheat sheet or any prohibited electronic device during the course of a classroom examination to assist the student or other students, the student using such prohibited device shall receive an “F” or “FF” in the course.”
Canvas:
Students must visit the BSC 6932 site on Canvas within the first week of class. Canvas will be an integral part of this course. Materials associated with class will be disseminated through Canvas. Canvas can be accessed at: https://my.usf.edu. If you have questions about Canvas, first check the list of frequently asked questions located on the USF website.
Computer and Network Access:
The use of Canvas must be consistent with the agreement that you signed to obtain a NetID. According to this agreement, students will not (1) provide access to the University’s network and computing resources to any other person or entity; (2) access another user’s account and/or misrepresent one’s identity; (3) allow another person to access their accounts or share their passwords; (4) use computing resources for private profit not related to university activities; (5) intentionally impede the legitimate use of computing facilities by other persons; (6) use computing resources for junk mail or mass mailing; (7) violate any law, regulation, or contract; (8) publish information that is threatening, harassing, abusive, defamatory or libelous; (9) publish or distribute illegally copied music, movies, software or other Intellectual Property, or otherwise infringe upon the copyrights of other persons or entities; (10) publish any information or software used to circumvent software licensing or registration; (11) advocate or solicit violence or criminal behavior; or (12) use computing resources to generate private profit not related to university activities.
Special Accommodations:
Students needing special assistance for any reason should not hesitate to contact the instructor, and should do so within the first two weeks of the course. Necessary accommodations will be provided as long as the instructor is notified in advance and the student provides a Memorandum of Accommodations from the Office of Student Disability Services. Accommodated examinations through the Office of Student Disability Services require a minimum of two weeks’ notice.
Religious Observances:
Students who anticipate being absent from class because of a major religious observance must provide notice of the date(s) to the instructor, in writing, by the second class meeting.
Excused Absences:
Excused absences are medical (individual or immediate family only; documented), legal (accident or court case; individual only; documented), funerary (immediate family only; documented), military (call to active duty; documented), religious (customarily-observed holidays; absence pre-arranged with instructor), and special requirements of other courses and University-sponsored events (exact nature of such requirements and events is unspecified; absence pre-arranged with instructor).
In-Class Behavior:
Please be respectful of other students during class. Arrive at class and lab on time. If you do arrive late, please enter the classroom as quietly as possible. Turn off cell phones and pagers before entering the classroom or the lab. Cell phone usage will not be tolerated during class for any reason. Refrain from talking while someone has the floor, but please do not hesitate to participate when questions are asked. This course will be taught in a computer lab. Please refrain from checking email, surfing the web, etc. during class.
In the Event of an Emergency:
In the event of an emergency, it may be necessary for USF to suspend normal operations. During this time, USF may opt to continue delivery of instruction through methods that include but are not limited to: Canvas, Elluminate, Skype, and email messaging and/or an alternate schedule. It is the responsibility of the student to monitor Canvas for each class for course-specific communication, and the main USF, College, and department websites, emails, and MoBull messages for important general information.