
DATA SCIENCE & DATA MANAGEMENT SYSTEMS


Academic year: 2021


In collaboration with:

A program by:

PG PROGRAM IN DATA SCIENCE & DATA MANAGEMENT SYSTEMS

FOR RECENT GRADUATES AND EARLY CAREER PROFESSIONALS


I N T R O D U C T I O N

Every business, ranging from agriculture to technology, generates tons of data for every process handled.

In the words of Peter Drucker

The data generated by modern companies is an important asset that can be leveraged to make effective business decisions. Data Science enables businesses to draw meaningful insights from massive amounts of data. As organizations realise the importance of being data-driven in an increasingly digital world, there has been a sharp uptick in the demand for data scientists.

As modern business strategies get built on processed data, the gap between the number of qualified data engineers and the number of open positions has presented a unique opportunity for aspiring professionals to build lucrative careers.

The Post Graduate Program in Data Science and Data Management Systems is a 6-month intensive online program, designed by McCombs School of Business Faculty, for Recent Graduates and Early Career Professionals.

The program starts with the very fundamentals of Data Science, and builds on the basics to give the learners a competitive edge in this domain.

The program focuses on Data Engineering and Data Analytics, both of which are the latest industry trends in the Data Science market.

The ability to manage huge data sets and draw actionable insights from them using tools like NoSQL, Apache Spark, Hadoop and Tableau is a top priority for businesses, which are looking for professionals specializing in data management and data analytics.

The program imparts in-demand Data Engineering and Data Analytics skills through a hands-on approach with practice lab sessions and industry-oriented projects, enabling learners to gain industry experience before they join a high-growth career.

According to the U.S. Bureau of Labor Statistics, Data Science roles are expected to grow by more than 25% in the coming years. According to a recent survey conducted by Analytics Insight, there will be more than 3 million new job openings in Data Science worldwide, and the average Data Science salary is $110,000.

"What gets measured, gets managed." - Peter Drucker


P R O G R A M B E N E F I T S

• MOST SOUGHT-AFTER DATA ENGINEERING AND DATA ANALYTICS TOOLS
• MENTORED LEARNING SESSIONS WITH EXPERTS
• CASE STUDIES FOCUSING ON REAL WORLD DATA ENGINEERING SCENARIOS
• LAB SESSIONS AND PROJECTS FOR HANDS-ON EXPERIENCE
• CAPSTONE PROJECT TO CONSOLIDATE YOUR LEARNINGS THROUGHOUT THE PROGRAM
• CERTIFICATE FROM THE UNIVERSITY OF TEXAS AT AUSTIN

P R O G R A M S T R U C T U R E

The program will be delivered in an online format through recorded lectures and more than 45 hours of online, live-mentored learning sessions. The mentored learning sessions are conducted by industry experts, who will help you gain industry exposure. You will also work on projects to get a simulated experience of the real challenges faced by a data scientist.

• 6 MONTHS
• 5 HANDS-ON PROJECTS
• 1 INTEGRATIVE CAPSTONE PROJECT
• WEEKLY MENTORED LEARNING SESSIONS


Industry Trends

Industry Growth

• $349 Billion global spending on Data Science in 2025. Source: IDC Spending Guide
• $110K average salary for Data Science roles. Source: Glassdoor
• 11.5 Million new jobs for Data Science professionals. Source: US Bureau of Labor Statistics
• 28% annual growth in Data Science jobs by 2026. Source: US Bureau of Labor Statistics

Hiring Companies

Source: Indeed

W H O I S T H I S P R O G R A M F O R ?

The PG Program in Data Science and Data Management Systems is designed to include practical hands-on skills across various tools and technologies that are in high demand and are critical for young career professionals to land their first job in the Data Science domain.

The program takes you through the entire Data Science value chain, which includes Data Management, Data Extraction, Exploratory Data Analysis, Data Visualization, leveraging Cloud Infrastructure, building Data Science Models, orchestrating Data Pipelines and presenting insights and solutions for critical business problems.

Our learners in PGP-DSDMS are early career professionals, with more than 85% of a cohort having less than 2 years of experience. They come from varied backgrounds like Information Technology, Banking, Pharma, Consulting and Research, with the drive to transform their careers in step with the rapidly growing Data Science industry.


Y O U R P E R S O N A L C A R E E R S U C C E S S T E A M

The PGP in Data Science and Data Management Systems is dedicated to ensuring the success of all participants, even beyond the lectures and curriculum. With the program, you will get access to GL-Excelerate, a career support program exclusive to our PG Program learners.

• 1-on-1 Career Sessions - Personal interactions with industry professionals and access to Career Workshops to gain valuable insights and guidance

• Resume & LinkedIn Profile Review - Present yourself in the best light through assets that truly showcase your strengths

• Interview Preparation - Get an insider’s perspective to understand what recruiters look for when hiring for Data Engineering and Data Analytics roles

Apart from these services, you will have access to a Career Workshop, where you will receive guidance on evaluating job opportunities, identifying your strengths and weaknesses, and preparing your elevator pitch for prospective employers.

You will also get a chance to appear for a Mock Interview, where you will get an opportunity to understand the expectations of recruiters, and receive personalized feedback on your performance.

With these tools, PG Program in Data Science & Data Management Systems enables you to take the right steps when it comes to your professional growth and career development.


C E R T I F I C A T E

The University of Texas at Austin

Conferred to attest that JOHN SMITH has successfully completed the Post Graduate Program in Data Science and Data Management Systems

June 2020

Gaylen Paulson, Associate Dean and Executive Director, Texas Executive Education

Kumar Muthuraman, Ph.D, Faculty Director, Data Science and Data Management Systems, Texas Executive Education

All certificate images are for illustrative purposes only.

The actual certificate may be subject to change at the discretion of the university.

Hands-on practice sessions using Popular Industry Tools

and more..


C O U R S E C U R R I C U L U M

COURSE 0: PRE-WORK

You will learn all the essentials of Computer Science, Programming and Statistics to build a strong foundation before you start your learning journey.

ESSENTIALS OF COMPUTER SCIENCE
• Hardware
• OS
• Data Structures & Algorithms
• Programming

PYTHON FUNDAMENTALS
• Setup
• Variables
• Data Types
• Operators
• Functions
• Loops
• OOPS

LINUX FUNDAMENTALS
• Basics of OS
• Protocols and Networking
• Basic Linux Commands

VERSION CONTROL
• Introduction to Git
• Features of Git
• Basic Commands
• GitHub

DESCRIPTIVE STATISTICS
• Measures of central tendency
• Measures of dispersion
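To give a flavour of the Descriptive Statistics topics in the pre-work, here is a minimal sketch of measures of central tendency and dispersion using only Python's standard `statistics` module. The sample data and variable names are hypothetical and chosen purely for illustration; this is not taken from the program's course material.

```python
import statistics

# Hypothetical sample: daily order counts for one week (illustrative data only).
orders = [12, 15, 11, 15, 20, 18, 14]

# Measures of central tendency
mean_orders = statistics.mean(orders)      # arithmetic average
median_orders = statistics.median(orders)  # middle value of the sorted data
mode_orders = statistics.mode(orders)      # most frequent value

# Measures of dispersion
value_range = max(orders) - min(orders)    # spread between the extremes
variance = statistics.pvariance(orders)    # population variance
std_dev = statistics.pstdev(orders)        # population standard deviation

print("mean:", mean_orders, "median:", median_orders, "mode:", mode_orders)
print("range:", value_range, "variance:", round(variance, 2), "std dev:", round(std_dev, 2))
```

The population variants (`pvariance`, `pstdev`) treat the list as the whole data set; `statistics.variance` and `statistics.stdev` would be the choice when the list is a sample from a larger population.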


COURSE 1

PYTHON FOR DATA SYSTEMS (4 WEEKS)

In this course, you will learn how to connect, extract and aggregate data present in various data sources, clean and perform Exploratory Data Analysis and derive meaningful insights using Python.

DATA PREPARATION
• Data Connection and Data Read
• Data Formatting
• Missing Value Treatment
• Dataframe Operations

EXPLORATORY DATA ANALYSIS
• Graphs and Plots
• Univariate and Bivariate analysis
• Correlation

WRANGLING UNSTRUCTURED DATA
• Web Scraping
• Data Cleaning
• Exception Handling

PROJECT 1

COURSE 2

SQL AND DATABASES (4 WEEKS)

In this course, you will learn how to query data in RDBMS and NoSQL DBs, design schemas and relationships between tables and automate data transformations in a database using Stored Procedures.

INTRODUCTION TO DBMS AND FUNDAMENTALS OF SQL
• Querying on SQL
• Functions
• Window Functions

DATA MODELING AND ARCHITECTURE
• ER Diagrams
• Schema Models
• Stored Procedures
• Views

NOSQL DATABASES
• File formats and comparison
• Introduction to MongoDB
• SQL operations

PROJECT 2
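As a small illustration of the Window Functions topic listed under Course 2, the sketch below computes a running total per region with `SUM(...) OVER (PARTITION BY ... ORDER BY ...)`. It uses Python's built-in `sqlite3` module so that it runs without a database server; the `sales` table, its columns and its rows are hypothetical, and the example assumes the bundled SQLite is version 3.25 or later (the first release with window function support). The program itself may use a different database engine.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", 1, 100),
        ("North", 2, 150),
        ("North", 3, 120),
        ("South", 1, 80),
        ("South", 2, 90),
    ],
)

# Window function: cumulative amount within each region, ordered by month.
# Unlike GROUP BY, every input row is kept and gains an extra column.
query = """
SELECT region, month, amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
FROM sales
ORDER BY region, month
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each region restarts its own running total because of `PARTITION BY region`, which is the main difference from a plain aggregate over the whole table.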


COURSE 3

DATA ANALYTICS ON CLOUD (6 WEEKS)

In this course, you will learn how to orchestrate a data pipeline and navigate the AWS cloud infrastructure, leveraging big data services to define and solve a business problem end-to-end: from gathering data requirements, to identifying drivers by formulating hypotheses, to presenting the insights in a markdown.

INTRODUCTION TO CLOUD INFRASTRUCTURE
• Cloud9-IDE
• Cluster Compute Services
• Storage and Databases

AIRFLOW FOR DATA PIPELINE MANAGEMENT - PART 1
• Data Orchestration
• DAG
• Code Architecture
• UI

FOUNDATIONS OF STATISTICS
• Inferential Statistics
• Distributions
• Sampling
• CLT
• A/B Testing

HYPOTHESIS TESTING
• Interpreting p-values
• Errors
• Parametric Tests - t-Test, Chi-Square Test

PROJECT 3

COURSE 4

DATA VISUALIZATION USING TABLEAU (3 WEEKS)

In this course, you will learn how to tell stories using data and create stunning dashboards with relevant visualizations to meet business needs using Tableau.

DATA, STORIES AND DASHBOARDING
• Visual Analytics
• Design Principles

TABLEAU - A BI TOOL
• Architecture
• Data Preparation
• Calculations
• Actions
• Performance Optimization

PROJECT 4


COURSE 5

(5 WEEKS)

ELECTIVE A: BIG DATA ENGINEERING

Learn how to navigate and build solutions on the cloud by leveraging the Hadoop Ecosystem and use PySpark to compute huge volumes of data efficiently.

INTRO TO HADOOP AND BIG DATA ECOSYSTEM
• HDFS
• YARN
• SQOOP
• HIVE fundamentals

DATA PROCESSING USING SPARK
• Hadoop vs Spark
• Spark Architecture
• Launch Modes
• RDDs

DATAFRAMES WITH SPARK SQL
• Dataframes
• Resource Allocation
• Partitioning
• Persistence

SPARK JOB OPTIMIZATION
• Memory Management
• Dynamic Allocation
• Compression
• Shuffle

PROJECT 5

ELECTIVE B: BUILDING DATA SCIENCE MODELS

Learn how to apply industry relevant Data Science techniques such as Regression, Classification, Clustering, Dimensionality reduction, etc. to solve real world problems.

SUPERVISED LEARNING PT. 1
• SL vs USL
• Regression
• Evaluation Metrics

SUPERVISED LEARNING PT. 2
• Classification
• Linear vs Logistic
• Decision Trees
• Confusion Matrix

UNSUPERVISED LEARNING
• K-means
• K-modes
• K-prototype
• Elbow curve
• Silhouette

MODEL TUNING
• Bias Variance trade-off
• Underfitting vs Overfitting
• k-fold validation

PROJECT 5
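To illustrate the Confusion Matrix and Evaluation Metrics topics listed under Elective B, the following minimal sketch counts the four cells of a binary confusion matrix in plain Python and derives accuracy, precision and recall from them. The labels are hypothetical (1 could stand for "customer purchased a loan", 0 for "did not"), and in practice a library such as scikit-learn would typically be used instead; that choice of tooling is an assumption, not something stated in the curriculum.

```python
# Hypothetical ground-truth labels and model predictions (illustrative data only).
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

pairs = list(zip(actual, predicted))

# The four cells of a binary confusion matrix.
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives

# Metrics derived from the matrix.
accuracy = (tp + tn) / len(pairs)  # share of all predictions that are correct
precision = tp / (tp + fp)         # share of predicted positives that are real
recall = tp / (tp + fn)            # share of real positives that were found

print("TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
```

Precision and recall are reported alongside accuracy because accuracy alone can look good on imbalanced data, where one class is much rarer than the other.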


This program also introduces you to advanced data science topics, which can be learnt at your own pace. These topics will bolster your understanding of Data Science, and will give you a competitive edge when applying for jobs and appearing for interviews.

COURSE 6

MODEL DEPLOYMENT (SELF-PACED)

In this course, you will learn model deployment techniques and make your model scalable, robust, and reproducible.

1. Model Deployment: Flask, Amazon SageMaker
2. Containerization using Docker: Productionalization
3. Container Orchestration: Kubernetes

COURSE 7

CORE STATISTICS (SELF-PACED)

In this course, you will learn how to perform a variety of statistical tests and the math behind them.

1. Tests for Normality: Shapiro-Wilk Test, Anderson-Darling Test, D’Agostino’s K2 Test
2. Parametric Tests: ANOVA, ANCOVA, Paired Student’s t Test
3. Non-Parametric Tests: Mann-Whitney U Test, Wilcoxon Signed-Rank Test, Kruskal-Wallis H Test
4. Tests for Correlation: Pearson’s Correlation Test, Spearman’s Rank Correlation Test

CAPSTONE PROJECT (4 WEEKS)

A comprehensive project that draws on all the tools and techniques you have learnt as a part of this program. With expert assistance, you will learn how to solve and manage real-world Data Science problems.


COURSE PROJECTS

Here are a few sample projects to give you a glimpse into the program:

MOVIELENS DATA EXPLORATION

Industry: Entertainment

Summary: The GroupLens Research Project is a research group in the Department of Computer Science and Engineering at the University of Minnesota. In this project, you will perform exploratory data analysis to understand the popularity trends of movie genres and derive patterns in movie viewership.

Tools & Concepts

PERSONAL LOAN CAMPAIGN

Industry: Banking

Summary: You will build a model that helps to identify potential customers of a bank who have a higher probability of purchasing a loan.

Tools & Concepts

CALL DROP ANALYSIS

Industry: Telecom

Summary: This project involves identifying the major reasons for call drops for a Telecom company. A large volume of call record data is analysed using big data technologies to identify the reasons and provide recommendations to improve the telecom services to customers.

Tools & Concepts

Supervised Learning, etc.


TAXI DEMAND PREDICTION

Industry: Transportation

Summary: Understanding taxi supply and commuter demand, especially the imbalance between the supply and the demand, would directly help to improve the quality of taxi service and eventually increase a city’s traffic system efficiency. As part of this project, you will use Python and Big Data tools to analyse the demand for taxis at specific times of the day and under specific weather conditions.

Tools & Concepts


E - P O R T F O L I O

SHOWCASE YOUR SKILLS WITH AN E-PORTFOLIO

The E-Portfolio summarizes all the projects you will undertake and tools you will learn during the program, helping you to stand out from other applicants in the highly competitive Data Science industry.

View Sample E-Portfolio here

P R O G R A M F A C U L T Y

Kumar Muthuraman, H. Timothy (Tim) Harkins Centennial Professor, Faculty Director, Center for Research and Analytics, McCombs, University of Texas at Austin. M.S & Ph.D, Stanford University

Dan Mitchell, Assistant Professor, McCombs School of Business. Ph.D, The University of Texas at Austin

Ashish Agarwal, Assistant Professor, McCombs School of Business. Ph.D, Tepper School of Business, Carnegie Mellon University


M E N T O R S

BECOME INDUSTRY-READY WITH LIVE MENTORSHIP

Along with strong theoretical foundations, hands-on learning goes a long way in preparing you to solve real-world business problems. As you work on real-life projects, you will receive personalised live mentorship every weekend from industry experts in Data Engineering and Data Analytics domains.

Ali Soleymani - Lead Data Scientist at Task Resource Ltd.

Hossein Kalbasi - Data Engineer at Concured

Matt Nickens - Manager, Data Science at The Walt Disney Studios

Mohammad Amini - Data & Applied Scientist II at Microsoft


A D M I S S I O N P R O C E S S

Admissions are conducted on a rolling basis, and the admission process closes once the requisite number of candidates has enrolled in the program.

1. Fill in a simple online application form.
2. Wait for the admission committee and faculty panel to review your application.
3. Get notified if your profile is shortlisted.

PROGRAM FEE - 3,200 USD

Payable in installments. Pay upfront and get a 5% discount. You can pay in up to 6 interest-free installments.


READY TO ADVANCE YOUR CAREER?

SPEAK TO A PROGRAM ADVISOR

+1 512 559 1644

Have questions about the program or how it fits in with your career goals?

email: [email protected]

APPLY NOW
