compiled by Rocky Jagtiani Tech Head for SCTPL , 9892544177
Section 9 : Case Study #
Objectives of this Session
The Motivation For Hadoop
What problems exist with traditional large-scale computing systems
What requirements an alternative approach should have
How Hadoop addresses those requirements
Hadoop: Basic Concepts
What Is Hadoop?
The Hadoop Distributed File System (HDFS)
How Google's MapReduce algorithm works
Anatomy of a Hadoop Cluster
Who uses Hadoop ?
db.suven.net
# Not part of the 1Z0-061 or 1Z0-144 certification tests, but a very important technology in Big Data analysis
• Hadoop Solutions
– The most common problems Hadoop can solve
– The types of analytics often performed with Hadoop
– Where the data comes from
– The benefits of analyzing data with Hadoop
– How some real-world companies use Hadoop
• Hadoop Ecosystem
• Cloudera Software (All Open-Source)
The Motivation For Hadoop
* MPI: Message Passing Interface; PVM: Parallel Virtual Machine
Major Problem
1 GB = 1000 MB , 1 TB = 1000 GB , 1 PB = 1000 TB , 1 EB = 1000 PB
TB => terabyte , PB => petabyte , EB => exabyte
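The unit ladder above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal (factor of 1000) units as on the slide; the names `UNITS` and `to_megabytes` are hypothetical and used only for illustration.

```python
# Decimal storage units, smallest to largest; each step is a factor of 1000.
UNITS = ["MB", "GB", "TB", "PB", "EB"]

def to_megabytes(value, unit):
    """Convert a size expressed in the given unit to megabytes."""
    steps = UNITS.index(unit)  # number of factor-1000 steps above MB
    return value * 1000 ** steps

print(to_megabytes(1, "GB"))  # 1000
print(to_megabytes(1, "PB"))  # 1000000000
print(to_megabytes(1, "EB") // to_megabytes(1, "PB"))  # 1000
```

One petabyte is therefore a billion megabytes, which gives a feel for the data volumes that motivate Hadoop.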
Hadoop History
Core Hadoop Concepts
Hadoop Components
HDFS
HDFS Concepts
HDFS: How Files Are Stored
How Files Are Stored: Example
IMP: How MapReduce Works
MapReduce: The Mapper
Example :
Anatomy of a Hadoop Cluster :
Who uses Hadoop?
Hadoop Solutions
A
B What is the problem if the data keeps coming?
C
The most common problems Hadoop can solve:
We briefly look at how Hadoop solves each problem.
D
How some real-world companies use Hadoop
E
Hadoop Ecosystem
Cloudera Software (All Open-Source)
* EDW: enterprise data warehouse
Conclusion :
Questions
1) Input to the mapper is
"Google is one of the richest companies"
"one who works with the Google is technical expert"
What will be the output after reducing?
2) Input to the mapper is
"Cat is eating milk"
"Cat is very sweet and she likes milk"
"milk is in bottle"
What will be the output after reducing?
3) Input to the mapper is
"Dollar is national currency for USA"
"Rupee is national currency for India"
"Dollar is ahead of Rupee in economy"
"India is developing country"
What will be the output after mapping?
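To work through questions of this kind, it can help to simulate the map, shuffle, and reduce phases in plain Python. The sketch below assumes the job is a word count (the questions suggest this, but the slides do not state it), that words are split on whitespace, and that case is ignored; the function names and the sample input are hypothetical, and this is not Hadoop's actual API.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    # Lowercasing is an assumption made here; a real job may keep case.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle/sort phase: group emitted values by key, with keys sorted."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)

# Hypothetical sample input, deliberately different from the questions.
lines = ["big data is big", "data is everywhere"]

mapped = [pair for line in lines for pair in mapper(line)]
reduced = dict(reducer(key, values) for key, values in shuffle(mapped).items())

print(mapped)
# [('big', 1), ('data', 1), ('is', 1), ('big', 1), ('data', 1), ('is', 1), ('everywhere', 1)]
print(reduced)
# {'big': 2, 'data': 2, 'everywhere': 1, 'is': 2}
```

The output "after mapping" corresponds to `mapped` (one pair per word occurrence, duplicates included), while the output "after reducing" corresponds to `reduced` (one total per distinct word). Replacing `lines` with the sentences from a question lets you check your own answer.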