Learn to Evolve
Disclaimer: This material is protected under copyright act AnalytixLabs ©, 2011. Unauthorized use and/ or duplication of this material or any part of this material including data, in any form without explicit and written permission from AnalytixLabs is strictly prohibited. Any violation of this copyright will attract legal actions.
WROX Certified Big Data Analyst Program
by AnalytixLabs and Wiley
About AnalytixLabs
Approach
• 80-20 focus on practical & theory
• Personal attention and Individual counselling
• Industry best practices
Content
• World class course structure
• Surpasses industry requirements
• Cater to Standard certifications
• High quality course material and real life
Faculty
• Seasoned analytics professionals
• Together we have 30+ years of experience with prestigious firms, like McKinsey, KPMG, Deloitte and AOL
• Regular sessions by industry experts
Bottom line
• Job-oriented training
• Lucrative job prospects in high growth domain
• Support for relevant certifications and diplomas
• Career counseling and planning
• Value for money with high return on investment
AnalytixLabs is a capability building and training solutions firm led by McKinsey, IIM and IIT alumni with
deep industry experience and a flair for coaching. We focus on helping our clients develop skills in
basic and advanced analytics so that they emerge as “Industry Ready” professionals and enhance
their career opportunities.
Candidates trained by us are working in leading companies across
industries…
Global Big Data Talent Skill Gap
McKinsey Global Institute estimates a shortage of nearly 1.7 million big data talents by
2018. This includes a shortage of 140,000 to 190,000 workers with deep technical and
analytical expertise, and a shortage of 1.5 million managers and analysts equipped to
Program Objective
The WCBDA program aims to provide its students an international, wide-spectrum qualification
for job-readiness and seamless absorption in Big Data job roles.
The program will expose students and professionals to the roles of Big Data Analysts
who have:
• Ability to think analytically
• Understanding of storage, retrieval and mining of data
• Exploratory analysis & predictive modeling skills
• Outcome-oriented and global industry-specific expertise in critical data analytics and data management skills
• Hands-on practical skills on Big Data tools R and Hadoop (MapReduce, HBase, Hive, Pig, Oozie, Sqoop, Mahout, ZooKeeper and Flume) and data visualization with Tableau
• Application of analytics in various domains, like Retail, Telecom, BFSI etc.
• Skills to leverage analytics to drive smart business decisions
AnalytixLabs has collaborated extensively with industry experts to put together a
program that is rigorous, effective and relevant.
WCBDA is a comprehensive program that encompasses the following
modules, along with projects at the end. It is crafted by a team of experts and maintains a balance between
theoretical concepts and practical applications.

Module 1: Analytics using R and Tableau – 42 hours + Practice
• Introduction to the R environment
• Data Input & Output
• Data Manipulation
• Visualization
• Basic statistics
• Advanced Analytics
• Data Visualization using Tableau
• Machine Learning using R
• Social Media Analytics using R
• Applying overall Learning

Module 2: Big Data Hadoop – 30 hours + Practice
• Introduction to Big Data & Hadoop
• Hadoop Architecture
• MapReduce
• R-Hadoop
• Introduction to Flume & Sqoop
• PIG
• HIVE
• HBase
• Mahout
• ZooKeeper
• Misc Components

Business Analytics using R & Tableau
Duration: 42 hours + Practice sessions

Introduction to R environment
• The Workspace
• Input/Output
• Useful Packages (Base & other packages) in R
• Graphic User Interfaces (RStudio)
• Customizing Startup
• Batch Processing
• Reusing Results

Data Input & Output (Importing & Exporting)
• Data Structure & Data Types (Vectors, Matrices, Factors, Data frames, and Lists)
• Importing Data (Importing data from csv, txt, Excel and other files)
• Keyboard Input (Creating input by entering data)
• Database Input (Connecting to a database and using the data)
• Exporting Data (Exporting files into different formats)
• Viewing Data (Viewing partial data and full data)
• Variable & Value Labels – Date Values
• Missing Data

Data Manipulation
• Creating New Variables (Calculations & Binning)
• Operators (Using multiple operators)
• Built-in Functions & User Defined Functions
• Control Structures (Conditional statements, Loops)
• Sorting Data
• Merging and Appending Data
• Aggregating Data
• Reshaping Data
• Subsetting Data
• Data Type Conversions

Visualization
• Creating Graphs
• Histograms & Density Plot
• Dot Plots – Bar Plots – Line Charts – Pie Charts – Boxplots – Scatterplots

Basic Statistics (Exploratory Analysis)
• Descriptive Statistics (central tendency/variance)
• Frequency Tables/Summarization
• Hypothesis Testing
• t-tests/z-test (1-sample, independent sample, paired sample)
• Analysis of Variance (ANOVA)
• Correlations/chi-square test

Advanced Analytics (Advanced Statistics)
• Introduction to predictive modeling & applications
• Linear (Simple & Multiple) Regression
• Logistic Regression
• Introduction to segmentation
• Segmentation using cluster analysis

Data Visualization using Tableau
• Introduction to Tableau & Environment
• Building basic views & sharing your work - overview
• Data importing & manipulation
• Maps/Tables/Calculated fields
• Parameters
• Data visualization with Charts maps
• Building & customizing Reports
• Building & customizing Dashboards

Machine Learning using R
• What is Machine Learning?
• Applications of Machine Learning Algorithms
• Classification & Regression Problems
• Training & Testing concepts – Cost & optimization functions
• Artificial Neural Networks (ANN)
• Support Vector Machines (SVM)
• Decision Trees & Random Forest
• Bayesian Network case

Social Media Analytics using R
• Social Media – Characteristics of Social Media
• Applications of Social Media Analytics
• Metrics (Measures Actions) in social media analytics
• Examples & Actionable Insights using Social Media Analytics
• Text Analytics – Sentiment Analysis using R
• Text Analytics – Word cloud analysis using R

Projects (Applying overall Learning)
• Solve Business problems using R/Tableau

Big Data Hadoop
Duration: 30 hours + Practice sessions

Introduction to Big Data & Hadoop
• What is Big Data?
• Types of Data
• Characteristics of Big Data
• Need for understanding Big Data (Application of Big Data)
• Traditional Approaches and their limitations
• Introduction to Hadoop and eco-system
• Getting Started with Hadoop (software installation etc.)

Hadoop Architecture
• Hadoop Commercial version vs Apache Hadoop
• Hadoop Cluster in commodity hardware
• Hadoop core components
• HDFS layer
• HDFS operation principle
• Basic Hadoop commands

MapReduce
• Introduction to MapReduce
• Hadoop MapReduce example
• Hadoop MapReduce Characteristics
• Setting up your MapReduce Environment
• Building a MapReduce Program
• Input Formats in MapReduce
• Output Formats in MapReduce
• Basic MapReduce Programming using R

R-Hadoop
• Introduction to RHdfs, Rmr and Rhbase
• Develop MapReduce code using R for Local & Hadoop env
• Exploratory analysis using R-Hadoop
• Predictive analytics using R-Hadoop
• Overview of Parallelization using R without Hadoop

Introduction to Flume & Sqoop
• Introduction to Sqoop (Why, what, processing, under the hood)
• Exporting data from Hadoop using Sqoop
• Introduction to Flume
• Flume Use Cases
• Hands on Exercise using Flume and Sqoop

PIG
• Introduction to PIG
• Components of PIG
• PIG Data Model
• Creating MapReduce programs using PIG
• Hands on Exercise using PIG

HIVE
• Introduction to HIVE and its characteristics
• Components of HIVE
• HIVE Data Models
• Serialization/De-serialization
• HIVE file formats
• HIVE Query Language
• HIVE Functions
• Difference between HIVE and PIG
• Hands on Exercise using HIVE

HBase
• HBase introduction and its Characteristics
• HBase Architecture
• Storage Model of HBase
• When to use HBase
• HBase Data Model
• HBase Families
• HBase Components
• Data Storage
• Hands on Exercise using HBase

Mahout
• Mahout introduction and its Characteristics
• Mahout Architecture
• When to use Mahout
• Machine Learning topics covered in Mahout

ZooKeeper
• Introduction to ZooKeeper
• Features of ZooKeeper
• Challenges faced in distributed applications
• Coordination
• ZooKeeper: Goals and Uses
• ZooKeeper: Entities, Data Model, Services

Misc Components
• Overview of Apache Oozie
• Overview of Storm
• Overview of Apache Cassandra
• Overview of Apache Spark
• Overview of H2O
• Social Media Analytics (Text Analysis, Word cloud)

Project (Applying overall Learning)
• Solve Business problems using all the components of Hadoop

Time and investment
Big Data Analytics: 72 hours + Practice, INR 32,000/ $700 (introductory price)
Certification Cost: INR 2000/ $50 (only applicable for WCBDA students)
Duration: 12 weekends, 72 hours of live training on Saturdays and Sundays (3 hours each) + Practice
Training mode: Fully interactive online class
(In addition to the above, you will also get access to the recordings for self study and practice)
Components: Content resources in print and e-format, simulations and videos, virtual lab
with software and datasets, industry-relevant project work
Certification: Participants will be awarded an International certificate on successful
completion of the stipulated requirements including an evaluation at the end
Placements: AnalytixLabs has an extensive Industry Network to facilitate Placements for its
students
Free career counseling and job assistance
• Extensive Industry Network to facilitate Placements
• Identify available career options for an individual
• Resume writing and interview preparation
• Recommend additional skill set and training/course to enhance employability
• Structure and define a career path
• Estimate the economic goals and compensation trends
• Advise and design strategies to meet individual goals

We provide trainings in a ‘fully interactive online’
• Saves commuting time and resources in today’s chaotic world
• Delivered lectures are recorded and can be replayed by individuals as per their needs
• One of the strongest global trends in education, even in developing countries
• Fully interactive live online class with personal attention
• Access to quality training and 24x7 practice sessions available at the comfort of your place
• Studies prove that online education beats the conventional classroom