(1)

An interdisciplinary model for analytics education

Raffaella Settimi, PhD

School of Computing

(2)
(3)

NPSMA Workshop, May 28-29, 2013

Drew Conway’s Data Science Venn Diagram

(4)

Who is a data scientist?

• Curious and inquisitive
• A computer scientist
• A statistician
• A data miner
• Creative and with strong communication skills
• ...and let’s not forget... a subject-domain expert

(5)

Learning outcomes of an Analytics MS curriculum

• Database processing
• Programming/scripting
• Algorithms and data modeling
• Visualization and communication
• Applications
• Hands-on experience
• Domain-specific competence


(6)

Core subjects


Database Processing
• SQL queries
• DB programming
• NoSQL DBs (e.g. Hadoop)

Data Mining
• Data cleaning, integration, and governance
• Association rules
• Basic statistics & data visualization

Statistical Analysis
• Multivariate statistics
• Time series analysis

Machine Learning
• Classification techniques
• Clustering methods
• Supervised and unsupervised learning

(7)

Tools and Platforms


Students should master a core set of tools and platforms for
- Data storage and integration
- Modeling and analysis
- Data visualization and reporting

Both open source and commercial software

Programming
• Python, Java

Data storage and integration
• Relational databases (MySQL, Oracle, SQL Server)
• Hadoop, NoSQL, MongoDB

Modeling and analysis
• R
• SPSS & SPSS Modeler
• SAS & SAS Enterprise Miner
• Matlab
• Weka
• pandas (Python Data Analysis Library)

Visualization
• Tableau
• MapPoint
• ArcView, etc.

(8)

DePaul’s MS degree in Predictive Analytics

Originally a specialization in Machine Learning of our MS in Computer Science.

Created in 2010 to address the increasing demand for graduates with deep technical and analytics skills to meet the challenge of mining Big Data.

[Chart: Enrollment by academic year (AY 2010/11, AY 2011/12, AY 2012/13); vertical axis from 0 to 100]

(9)

From a 2012 survey of our students

Students are most interested in courses around working with data (including “big data”) and data analysis, as well as gaining additional experience in programming and marketing.

*Created on Wordle.net. Size is relative to the overall number of mentions/responses; position does not matter. Top 50 words/mentions shown.

(10)

Current Positions held by our students

[Pie chart categories: Analytics Positions, Not Related Position, Not working full time; slice labels: 50%, 20%, 30%]

Breakdown of industries among students with analytics positions

[Bar chart categories: Banking, Consulting, Education, Food/Beverage, Insurance, IT/Technology, Marketing/Advertising, Not for Profit, Health Care, N/A]

(11)

DePaul’s MS in Predictive Analytics curriculum


Common core

Concentrations:
• Computational Methods concentration (Fall 2011)
• Marketing concentration (Fall 2010)
• Hospitality concentration (Fall 2013)
• Health Care concentration (Winter 2014)

Prerequisite knowledge:
• Intro to Statistics
• Python
• Calculus & Linear Algebra (can be taken before MS)

Practicum Course

(12)

Links to Curriculum

Course home page:

http://www.cdm.depaul.edu/academics/Pages/MS-in-Predictive-Analytics.aspx

Concentrations:

– Computational methods: view requirements
– Marketing: view requirements
– Hospitality: view requirements
– Health Care: available in Winter 2014

(13)

Common Core

Teaches the fundamental tools and techniques for Data Science.

• Database processing (SQL queries, relational databases, NoSQL DBs, data management and integration)
• Statistical modeling (regression analysis, multivariate statistics)
• Data mining and machine learning (data cleaning, association rules, clustering, classification techniques, etc.)
• Application of analytics in social networks, web data mining, text mining

(14)

Applications


Social Networks
• Analysis of network structure, data retrieval from networks, text analysis

Web analytics
• User behavior modeling, e-metrics for business intelligence, web personalization, recommender systems, privacy and ethical issues

Text mining
• Information retrieval models, document clustering, taxonomies, sentiment analysis

(15)

Additional electives

• Image analysis: image representation, segmentation, pattern recognition
• Monte Carlo techniques
• Visualization techniques and design principles
• Data stream analysis
• ETL, data warehousing and business intelligence tools (dashboards, reporting, etc.)

(16)

Computational Methods concentration

view requirements

Created in response to demand from students who wanted to develop the strong technical skills required for Big Data analytics.

Courses in
– Mining Big Data
– Programming analytics applications in Python
– Advanced data mining techniques (matrix factorization, probabilistic networks, etc.)
– Machine Learning algorithms

Students learn how to apply advanced data mining and database processing techniques for the analysis and management of extremely large datasets.

(17)

Marketing concentration jointly with the Marketing Department

view requirements

Every day the amount of data available to businesses increases, and more information is available about markets, products, competitors and customers. Companies gain a competitive advantage by using analytics to uncover insights about their markets and make smarter decisions.

Courses in
– Customer Relationship Management
– Marketing analytics
– Internet marketing
– Customer service and analysis

Students learn how to
– Apply analytics to mine marketing data
– Extract information from data to support business decision making and marketing decisions.

(18)

Hospitality concentration jointly with the School of Hospitality Leadership

view requirements

Organizations in the tourism industry (hotels, restaurants, travel) have access to an abundance of data, both internal and from third parties, available through social media channels such as Trip Advisor and Yelp.

Students learn how to
– Apply analytics to mine hospitality data, incorporating revenue management principles and optimization techniques
– Assess hospitality global distribution system analytics and predict impacts on service-firm financial performance
– Identify revenue management principles and optimization models unique to the various service sectors within the hospitality industry

(19)

Health Care concentration jointly with the Marketing dept. and Health Sector Mgmt program

The recent changes in healthcare have led to a paradigm shift in the healthcare industry and an increasing need to use data to predict trends in illness, disease, injury, utilization, and costs.

Students learn how to
• Apply analytics to mine health care data such as
  – Patient experience/satisfaction/outcomes
  – Claim management and cost reduction
  – Predictive modeling of care, costs and utilization
  – Pharmacy data
• Develop evidence-based business models to improve health care strategies, such as patient experience, clinical processes, and resource allocation.

(20)


In 2010, we created an interdisciplinary academic center to bring together expertise of faculty from different schools and programs at DePaul University:

– Computing
– Marketing
– Hospitality leadership (added in 2012)
– Health sector management (added in 2013)

Aimed to be a “center without walls” facilitating:

• Faculty and students’ research across disciplines
• State-of-the-art curriculum for preparing a new generation of specialists in data mining and predictive analytics
• Faculty and students’ collaborations with industries
• Students/Alumni matching application with employers’ needs
• Networking events

(21)

Provide students with real-world experience: Think outside the classroom

• Data science cannot be learnt just by sitting in a classroom and listening to lectures
• Students should
  – Use real data in courses
  – Work on large scale projects
  – Gain experience through internships or industry sponsored projects
  – Have access to a variety of platforms and tools
  – Network with analytics professionals

(22)

Challenge: Access to real data

It can be hard to get real data from companies, because data often contain sensitive information about the company or customers.

Internships are easier to set up, as data remains at the company site.

Industry-sponsored projects are a win-win opportunity for companies that can take advantage of a team of students and the expertise of a faculty member supervising the project, at no or relatively low cost.

(23)

DaMPA Industry Partnerships


Data: Companies provide datasets to be used for class projects or student research projects.

Software: Companies provide software or training material to be used for teaching or research.

Education Advisory Board: Companies recruit students, serve as industry advisors, and guide the Center on curriculum development and long-term planning.

Research Innovation Board: Partner to translate new science into novel technologies and to address unmet industry needs.

(24)

Examples of projects

• Medical Informatics
  – NSF REU program in medical informatics (joint with University of Chicago)
  – Computer-aided detection, diagnosis, and characterization for lung nodules (joint with University of Chicago)
  – Prediction of chronic fatigue syndrome (joint with DePaul Psychology Department)
  – Tracking illness from Tweets
  – Analysis of legionellosis occurrence (data from Chicago Public Health Office)

• Web Data Mining, Web Personalization, and Recommender Systems
  – Ontology-based user modeling for web personalization and recommendation
  – Recommender Systems for the Social Web
  – Trustworthy and Secure Recommender Systems for the Web

• Urban studies
  – Motor Vehicle theft analysis (data from Chicago Police Dept.)
  – A data-driven typology of urban communities in Cook County (joint with Institute of Housing Studies)

• Hospitality Projects
  – Food and Beverage Analytics and Optimization Modeling

(25)

Where are our graduates employed?

Internships or full time
