DATA SCIENCE ADVISING NOTES
David Wild - updated May 2015
GENERAL NOTES
Lots of information can be found on the website at http://datascience.soic.indiana.edu.
Dr David Wild, Data Science Graduate Program Director, is the default advisor for all students. Email questions regarding advising should be sent to [email protected].
You can book an advising session with David Wild during the semester (starting fall 2015) on Tuesdays or Wednesday mornings at https://djwild.youcanbook.me/.
Residential students should meet in Informatics West 207 (901 E 10th St); online students should provide a Skype or Google Hangouts username for an online session.
PATHS AND TRACKS
There is understandably some confusion about Paths and Tracks.
A path is not a formal part of your degree, but a guide to help you choose courses that are right for you. Courses are classed into Decision Maker or Technical paths. Decision Maker courses focus on the skills needed by data science decision makers, such as utilization of data science techniques, social factors, and domain-specific applications. Technical courses focus on the technical side of data science and often require strong programming skills. You do not have to “declare” a path for your degree, and being on one path doesn’t mean you can’t take courses from another path. In the course listings below, we tag the classes with paths (note that these tags are currently provisional, and some may change).
A track is a more formal specialization, and the track determines which courses you can take. The default is the general track, in which you can pick whichever courses you wish. Currently we have just one specialization track, computational & analytic (C&A), the requirements of which are listed at the end of this document. We intend to add more tracks in the future.
ONLINE COURSES - upcoming semester classes highlighted in green
Online classes are also available to residential students. Some online classes have physical classroom meetings for residential students; others are held completely online.
Course | Next Class (O=Online, R=Residential) | Instructor | Path / Specialization Track | Notes
INFO I590 Data Science in Drug Discovery, Health and Translational Medicine | Expected Spring 2016 | Wild | Decision Maker | Website: http://dsdht.wikispaces.com. Prerequisites: Ability to perform basic statistical tasks in R; conceptual understanding of machine learning.
INFO I590 Management, Access, and Use of Big and Complex Data | Fall 2015 #34881 (O) #34879 (R) | Plale | Decision Maker & Technical |
INFO I590 Big Data Applications and Analytics | Fall 2015 #34717 (O) #15590 (R) | Fox | Decision Maker | Some programming experience required, Python preferred.
INFO I590 Big Data Open Source Software and Projects | Expected Spring 2016 | Fox | Technical | Familiarity with scripting languages such as Linux shell and especially Python. Knowledge of Java helpful but not required.
CSCI B649 Cloud Computing for Data Intensive Sciences | Expected Spring 2016 | Qiu | Technical / C&A | This is a programming-intensive course. It has similar requirements to the CS graduate-level residential version. Students are expected to have weekly (or biweekly) programming homework. General programming experience with Windows or Linux using Java (2-3 years) and scripts is required. A background in parallel and cluster computing is a plus, although not necessary.
CSCI B649 High Performance Computing | Fall 2015 #34732 (O) #11972 (R) | Sterling | Technical | Intermediate C/C++ experience. Familiarity with Linux/Unix command-line utilities.
ILS Z636 Data Semantics | Expected Spring 2016 | Ding | Technical | Basic knowledge of HTML and XML is necessary. Basic knowledge of Java can be helpful.
ILS Z637 Information Visualization | Expected Spring 2016 | Börner | Decision Maker / C&A |
ILS Z604 Social and Organizational Informatics of Big Data | Fall 2015 #33198 (O) #33197 (R) | Rosenbaum & Fichman | Decision Maker |
RESIDENTIAL COURSES - upcoming semester classes highlighted in green
Course | Next Class | Instructor | Path / Specialization Track | Notes
CSCI B503: Algorithms Design and Analysis | Fall 2015 #7461 | Ergun | Technical / C&A |
CSCI B534: Distributed Systems | | | Technical / C&A |
CSCI B551: Elements of Artificial Intelligence | Fall 2015 #3171 | Leake | Technical / C&A |
CSCI B552: Knowledge-Based Artificial Intelligence | | | Technical |
CSCI B553: Neural and Genetic Approaches to Artificial Intelligence | | | Technical / C&A |
CSCI B555: Machine Learning | Fall 2015 #30744 | White | Technical / C&A |
CSCI B561: Advanced Database Concepts | Fall 2015 #12330 or #3172 | Zhang | Technical / C&A |
CSCI B565: Data Mining | Fall 2015 #35008 | Dalkilic | Technical / C&A |
CSCI B649: Advanced Topics in Privacy | | | Decision Maker / C&A |
CSCI B652: Computer Models of Symbolic Learning | | | Technical |
CSCI B656: Web mining | | | Technical |
CSCI B659: Information Theory and Inference | | | C&A |
CSCI B661: Database Theory and System Design | | | C&A |
CSCI B662: Database Systems & Internal Design | | | Technical / C&A |
CSCI B669: Topics in Database and Information Systems: Scientific Data Management and Preservation | | | Decision Maker |
INFO I519: Introduction to Bioinformatics | Fall 2015 #8668 | Ye | Technical / C&A |
INFO I520: Security For Networked Systems | Fall 2015 #31306 | Camp | Technical / C&A |
INFO I525: Organizational Informatics and Economics of Security | | | Decision Maker |
INFO I529: Machine Learning in Bioinformatics | | | Technical? / C&A |
INFO I533: Systems & Protocol Security & Information Assurance | | | Technical / C&A |
INFO I573: Programming for Science Informatics | | | Technical |
Complex Networks and their Applications | | | |
INFO I590: Topics in Informatics: Applied Machine Learning | | | Technical |
INFO I590: Topics in Informatics: Complex Systems | | | Technical |
INFO I590: Topics in Informatics: Mining the Social Web | Fall 2015 #33588 | Ferrara | Decision Maker |
INFO I590: Topics in Informatics: Relational Probabilistic Models | | | Technical / C&A |
INFO I590: Visual Analytics | | | Technical |
ILS P536: Advanced Operating Systems | | | Technical |
ILS P538: Computer Networks | | | Technical / C&A |
ILS Z511: Database Design | Fall 2015 #6149 | Bourlai | Technical |
ILS Z534: Information Retrieval: Theory and Practice | Fall 2015 #14383 or #15713 | Liu / Guo | C&A |
ILS Z604: Topics in Library and Information Science: Data Curation | | | Technical |
ILS Z604: Topics in Library and Information Science: Scholarly Communication | | | |
Information Science: Big Data Analysis for Web and Text | | | |
ILS Z605: Internship in Library and Information Science | #6154 | Fichman | | To be arranged with faculty advisor
ILS Z652: Digital Libraries | #7433 | Walsh | Decision Maker |
STAT S520: Intro to Statistics | #13627 | Luen | C&A |
STAT S670: Exploratory Data Analysis | #8932 | King | |
STAT S675: Statistical Learning & High-Dimensional Data Analysis | #14446 | Trosset | |
STAT S681: Statistical Network Analysis | | | |
COMPUTATIONAL & ANALYTIC TRACK REQUIREMENTS
Requirements for the track:
1. A student must take at least 3 courses (9 credits) from Category 1 (Core Courses). CSCI B503 is required.
2. A student must take at least 2 courses from Category 2 (Data Systems) AND at least 2 courses from Category 3 (Data Analysis). Courses taken in Category 1 can be double counted if they are also listed in Category 2 or Category 3.
3. A student must take at least 3 courses from Category 2 (Data Systems) OR at least 3 courses from Category 3 (Data Analysis). Again, courses taken in Category 1 can be double counted if they are also listed in Category 2 or Category 3.
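The three requirements above can be checked mechanically against a planned course list. The following Python sketch is illustrative only and is not an official degree audit. The course labels are hypothetical keys made up for this example (several courses share a number, so the title is part of the key), and the function name check_ca_track is also an assumption. It further assumes that STAT S520 and the new Probabilistic Reasoning course are alternatives that fill a single Category 1 slot, which the notes do not state explicitly. Confirm any result with your advisor.

```python
# Illustrative labels only; these are not official registrar identifiers.
B503 = "CSCI B503 Analysis of Algorithms"
B561 = "CSCI B561 Advanced Database Concepts"
S520 = "STAT S520 Introduction to Statistics"
PROB = "Probabilistic Reasoning (new course)"

# Category 2: Data Systems, as listed in these notes.
CATEGORY_2 = frozenset({
    "CSCI B534 Distributed Systems",
    B561,
    "CSCI B662 Database Systems & Internal Design",
    "CSCI B649 Cloud Computing",
    "CSCI B649 Advanced Topics in Privacy",
    "CSCI P538 Computer Networks",
    "INFO I533 Systems & Protocol Security & Information Assurance",
    "ILS Z534 Information Retrieval: Theory and Practice",
})

# Category 3: Data Analysis, as listed in these notes.
CATEGORY_3 = frozenset({
    "CSCI B565 Data Mining",
    "CSCI B555 Machine Learning",
    "INFO I590 Applied Machine Learning",
    "INFO I590 Complex Networks and Their Applications",
    S520,
})


def check_ca_track(courses):
    """Return a dict saying which of the three track rules a plan satisfies."""
    taken = set(courses)
    # Assumption: S520 and the new course are alternatives for one core slot.
    core_count = len(taken & {B503, B561}) + (1 if taken & {S520, PROB} else 0)
    # Set intersection lets a Category 1 course also count in Category 2 or 3.
    systems = len(taken & CATEGORY_2)
    analysis = len(taken & CATEGORY_3)
    return {
        "rule_1_core": B503 in taken and core_count >= 3,
        "rule_2_breadth": systems >= 2 and analysis >= 2,
        "rule_3_depth": systems >= 3 or analysis >= 3,
    }


# Example plan: three core courses plus one systems and two analysis courses.
example_plan = [
    B503, B561, S520,
    "CSCI B534 Distributed Systems",
    "CSCI B565 Data Mining",
    "CSCI B555 Machine Learning",
]
print(check_ca_track(example_plan))
```

In the example plan, CSCI B561 counts toward both Category 1 and Category 2, and STAT S520 counts toward both Category 1 and Category 3, which is the double counting the requirements allow.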
Category 1: Core Courses
CSCI B503 Analysis of Algorithms (Data Analysis and Statistics) REQUIRED
CSCI B561 Advanced Database Concepts (Data Management and Infrastructure)
STAT S520 Introduction to Statistics OR (New Course) Probabilistic Reasoning (Data Analysis and Statistics)
Category 2: Data Systems
CSCI B534 Distributed Systems (Data Management and Infrastructure)
CSCI B561 Advanced Database Concepts (Data Management and Infrastructure)
CSCI B662 Database Systems & Internal Design (Data Management and Infrastructure)
CSCI B649 Cloud Computing (Data Management and Infrastructure)
CSCI B649 Advanced Topics in Privacy (Data Management and Infrastructure)
CSCI P538 Computer Networks (Data Management and Infrastructure)
INFO I533 Systems & Protocol Security & Information Assurance (Application areas)
ILS Z534 Information Retrieval: Theory and Practice (Data Analysis and Statistics)
Category 3: Data Analysis
CSCI B565 Data Mining (Data Analysis and Statistics)
CSCI B555 Machine Learning (Data Lifecycle)
INFO I590 Applied Machine Learning (Data Lifecycle)
INFO I590 Complex Networks and Their Applications (Data Management and Infrastructure)
STAT S520 Introduction to Statistics (Data Analysis and Statistics)
Category 4: Elective Courses
CSCI B551 Elements of Artificial Intelligence (Data Lifecycle)
CSCI B553 Probabilistic Approaches to Artificial Intelligence (Data Analysis and Statistics)
CSCI B659 Information Theory and Inference (Data Analysis and Statistics)
CSCI B661 Database Theory and Systems Design (Data Management and Infrastructure)
INFO I519 Introduction to Bioinformatics (Application areas)
INFO I520 Security For Networked Systems (Data Management and Infrastructure)
INFO I529 Machine Learning in Bioinformatics (Application areas)
INFO I590 Relational Probabilistic Models (Data Analysis and Statistics)
ILS Z637 Information Visualization (Data Analysis and Statistics)