(1)

Industry-Driven Master Certificate in Data Science

(2)

Outline

External Contexts

Academic Program Design Approach

Analysis of the LinkedIn Data

Mix of Competences

Academic programs at the University of Perugia

Conclusion and Further Activities

Bari 12-11-2015

(3)

External Contexts

(4)

External Contexts

Gartner Hype Cycle for emerging technologies, 2014 and 2015

source: http://www.gartner.com/newsroom/id/3114217

(5)

External Contexts

Focused Market Analyses

[Figure: market analysis chart with "Current" and "Momentum" labels]

Source: SCM World + Big Data Analytics, Mobile Technologies And Robotics Defining The Future Of

(6)

External Contexts

Retail

Financial Services

Healthcare

Energy

Telcos

Web/Social/Media

Government

Manufacturing

(7)

External Contexts

Related academic initiatives worldwide

School | Degrees

Undergraduate
George Mason University | B.S.
Illinois Institute of Technology | B.S. Certificate
Oxford University | Adv. Diploma

Certificates
iSchool @ Syracuse | Grad Cert., 5 courses
Rice University | Cert.
Stanford University | Grad Cert.
University of California San Diego | Grad Cert., 6 courses
University of Washington | Cert.

Masters
Bentley University | M.S.
Carnegie Mellon |
DePaul University | M.S.
Georgia Southern University | M.S., 30 cr.
Louisiana State University (businessanalytics.lsu.edu/) | M.S., 36 cr.
Illinois Institute of Technology | Masters, 4 courses
Michigan State University | M.S.
North Carolina State University: Institute for Advanced Analytics | M.S., 30 cr.
Northwestern University | M.S.
New York University | M.S., 1 yr
Stevens Institute of Technology | M.S., 36 cr.
University of Cincinnati | M.S.
University of San Francisco | M.S.
UC Berkeley | M.S.

Ph.D
George Mason University | Ph.D.
IU SoIC | Ph.D.

(8)

Academic Program Design Approach

Analysis of data

Interaction and coordination with interested companies.

Analysis of the existing academic programs and similar projects

Reference profile from IWA Italy Chapter, “DATA SCIENTIST”, June 30, 2014.

http://www.skillprofiles.eu/stable/g3/profiles/WSP-G3-024.pdf

(9)

Analysis of the LinkedIn data

Thousands of job offers

Data Collection and keyword identification: php, JAVA, Ruby, Machine Learning, Statistics, Information Retrieval, Graph Analysis, Big Data, Hadoop, hBase, Hive, Pig, Scala, HTML, SQL, Python, MATLAB, Data Mining, Bachelors degree, BS degree, MS degree, M.S, B.S., PhD, Applied Mathematics, Computer Science, database, text mining, SAS, neural network, NLP, Electrical Engineering, Operations Research, Linux, Unix, Teradata, C/C++, Spark.

Dimensionality reduction through PCA

Extraction of the most significant components
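
The steps on this slide (keyword identification in job offers, then PCA) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used for the LinkedIn data: the slides do not say how the data matrix was built, so the binary offer-by-keyword encoding, the case-insensitive whole-term matching, the five-keyword subset, and the sample offer texts are all illustrative choices. NumPy is assumed to be available.

```python
import re
import numpy as np

# Small subset of the keywords listed on the slide; the full list can be substituted.
KEYWORDS = ["Python", "SQL", "Hadoop", "Machine Learning", "Statistics"]

def keyword_matrix(offers, keywords):
    """Return a binary matrix: one row per job offer, one column per keyword."""
    # Match each keyword as a whole term, ignoring case (an assumed matching rule).
    patterns = [
        re.compile(r"(?<!\w)" + re.escape(k) + r"(?!\w)", re.IGNORECASE)
        for k in keywords
    ]
    return np.array(
        [[1.0 if p.search(text) else 0.0 for p in patterns] for text in offers]
    )

def pca(X):
    """PCA via SVD of the column-centered matrix.

    Returns (components, explained_variance_ratio); components[i] holds the
    loading of each keyword on principal component i + 1.
    """
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2
    return vt, variance / variance.sum()

# Hypothetical job-offer texts, invented for illustration only.
offers = [
    "Data scientist with Python, SQL and machine learning experience",
    "Hadoop engineer, SQL required",
    "Statistics and machine learning researcher, Python a plus",
    "Python developer with Hadoop exposure",
]

X = keyword_matrix(offers, KEYWORDS)
components, ratio = pca(X)
print(ratio)  # fraction of variance explained by each component, largest first
```

Each entry of `ratio` is the fraction of total variance carried by one principal component, and each row of `components` gives the keyword loadings for that component.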

(10)

Analysis of the LinkedIn data

PCA results

[Figure: fractional variance explained and total variance explained (%) versus principal component, components 1 to 10]

- The two most significant components cannot capture most of the process energy.
- The third component is necessary for collecting a significant percentage of energy.
- The remaining energy is still significant. It shows the heterogeneous nature of Data
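
The reading of the plot above amounts to accumulating the per-component variance fractions and checking how many components are needed to pass a chosen threshold. A minimal Python sketch of that check follows. The fractions are hypothetical, not read from the slide's plot; they were chosen only so that two components stay below half of the variance, mirroring the qualitative statement above. The function name `components_needed` is likewise an illustrative choice.

```python
from itertools import accumulate

def components_needed(fractions, threshold):
    """Smallest number of leading components whose cumulative
    explained-variance fraction reaches the threshold, or None if it never does."""
    for count, total in enumerate(accumulate(fractions), start=1):
        if total >= threshold:
            return count
    return None

# Hypothetical explained-variance fractions for 10 components (NOT the slide's data).
fractions = [0.30, 0.15, 0.12, 0.10, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03]

print(components_needed(fractions, 0.5))  # 3: two components reach only about 0.45
print(components_needed(fractions, 0.8))  # 6: a long tail is still needed
```

A slowly rising cumulative curve of this kind is what indicates that no small set of components summarizes the keyword data.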

(11)

Analysis of the LinkedIn data

PCA results

[Figure: 3-D plot of the keywords against Component 1, Component 2 and Component 3]

(12)

MIX of Competences

Programming & Digital Technologies
- Computational Systems (concurrency and distributed systems)
- Programming Languages and Paradigms (C, R, OpenMPI, Python)
- Tools (Big Data, Cloud Platforms, Databases, Sensors,…)

Domain-specific & Analysis skills
- Data sources (Open and Linked Data) curation
- Standards and Certification for the domain
- Interpretation skills (Knowledge extraction)

Business-orientation & Communication
- Marketing and Market Analysis (Innovation leadership)
- Legal and Ethical elements
- Data Visualisation and Communication Skills

Maths and Statistics competences
- Statistics and Probability; Algebra and Calculus
- Machine Learning
- Data Mining and Business Intelligence

(13)

Data Science: Academic programs at the University of Perugia

Master of Science degree in Computer Engineering and Robotics

Data Science Program

Focus on data engineering and analytics with applications to business intelligence

Robotics Program

Focus on cloud robotics driven by machine learning

Post-graduate International, Industry-Driven Master Certificate in Data Science

Focus on Finance, Energy & Utilities, Telcos, Social & Media, Industry/Manufacturing, Services

(14)

Detailed program of the Master Certificate

Designed to be completed in 12 months, over 1500 hours of lectures, lab experiments, business-focused seminars, individual and group study, and project design.

Lectures in English

International guest lecturers from academia, industry, and service providers.

Module | Hours | ECTS
Fundamentals of Computer Science | 14 | 2
Fundamentals of Statistics | 14 | 2
Big Data Processing and Tools | 28 | 4
Business Intelligence | 21 | 3
Cloud Computing | 28 | 4
Data Science Tools | 71 | 8
Networking and Data Security | 28 | 4
Data Mining and Machine Learning | 52 | 6
Visual Analytics | 40 | 5
Communication and Presentation Skills | 21 | 3
Data-driven Marketing | 21 | 3
Seminars focussed on Finance, Energy & Utilities, Telcos, Social & Media, Industry/Manufacturing, Services | 42 | 6
Stage in Industries | 225 | 9
Final project preparation | 150 | 6
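
As a quick consistency check on the program table, the following minimal Python sketch tallies the Hours and ECTS columns. The figures are copied as they appear on the slide, with the duplicated "Stage in Industries" row counted once; the short label "Seminars" stands in for the sector-focused seminars row and is an abbreviation introduced here.

```python
# (module, hours, ECTS) as listed in the program table
MODULES = [
    ("Fundamentals of Computer Science", 14, 2),
    ("Fundamentals of Statistics", 14, 2),
    ("Big Data Processing and Tools", 28, 4),
    ("Business Intelligence", 21, 3),
    ("Cloud Computing", 28, 4),
    ("Data Science Tools", 71, 8),
    ("Networking and Data Security", 28, 4),
    ("Data Mining and Machine Learning", 52, 6),
    ("Visual Analytics", 40, 5),
    ("Communication and Presentation Skills", 21, 3),
    ("Data-driven Marketing", 21, 3),
    ("Seminars", 42, 6),  # abbreviated label for the sector-focused seminars row
    ("Stage in Industries", 225, 9),
    ("Final project preparation", 150, 6),
]

total_hours = sum(hours for _, hours, _ in MODULES)
total_ects = sum(ects for _, _, ects in MODULES)

print(total_hours)  # 755
print(total_ects)   # 65
```

The 755 tabulated hours are lower than the "over 1500 hours" quoted for the whole program, which also counts individual and group study.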

(15)

Conclusion and Further Activities

Design of the program in Data Science at the University of Perugia, tailored to industry needs.

Corresponding PhD research activities

Continuous monitoring of results and expectations

Candidate to become champion in the Edison Project

http://edison-project.eu/
