Comprehensive Analytics on the Hortonworks Data Platform
We do Hadoop.
Back to 2005…
Vertical Scaling
[Diagram: a single server with RAM, CPU, and Storage]
Horizontal Scaling
[Diagram: many servers, each with its own RAM, CPU, and Storage]
Self-Healing System
[Diagram: cluster nodes 1 through N]
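One way a cluster of nodes 1 through N can heal itself is by keeping several copies of every piece of data and re-copying when a node fails, which is the approach HDFS takes with block replicas. The following is only a toy sketch of that idea: the node names, block names, and random placement are hypothetical, and it does not reproduce HDFS's actual placement policy.

```python
import random

REPLICATION = 3  # assumed target; 3 is HDFS's default replication factor


def place_blocks(blocks, nodes, rng):
    """Assign each block to REPLICATION distinct nodes (random toy placement)."""
    return {block: set(rng.sample(nodes, REPLICATION)) for block in blocks}


def heal(placement, live_nodes, rng):
    """Drop replicas on dead nodes, then copy each block to new live nodes."""
    for block, holders in placement.items():
        holders.intersection_update(live_nodes)  # forget replicas on failed nodes
        if not holders:
            continue  # every replica was lost; nothing left to copy from
        candidates = [n for n in live_nodes if n not in holders]
        missing = REPLICATION - len(holders)
        holders.update(rng.sample(candidates, min(missing, len(candidates))))
    return placement


rng = random.Random(0)  # seeded so the run is repeatable
nodes = [f"node{i}" for i in range(1, 7)]  # hypothetical six-node cluster
placement = place_blocks(["blk_a", "blk_b", "blk_c"], nodes, rng)

survivors = [n for n in nodes if n != "node1"]  # simulate node1 failing
heal(placement, survivors, rng)
```

After `heal` runs, every block is back to three replicas, all on surviving nodes, without any operator action.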
HDFS
(Hadoop Distributed File System)
MapReduce
Hadoop 1.0
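For readers new to MapReduce, the processing model can be sketched in a few lines of single-process Python. This is an illustration of the map, shuffle, and reduce phases only, not the Hadoop API; the function names and sample records are made up for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = run_job(["Hadoop stores data", "Hadoop processes data"])
```

On a real cluster the map and reduce calls run in parallel on the nodes that hold the data; here they run sequentially in one process.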
Hadoop 2.0
SOURCES
- Existing Systems (ERP, CRM, SCM)
- Clickstream
- Web & Social
- Geolocation
- Sensor & Machine
- Server Logs
- Unstructured
ANALYTICS
- Data Marts
- Applications
- Business Analytics
- Visualization & Dashboards
HDFS (Hadoop Distributed File System)
YARN: Data Operating System
Batch, Interactive, Real-Time, Partner ISV
EDW, MPP
Hortonworks Data Platform 2.2
HDFS (Hadoop Distributed File System)
YARN: Data Operating System (Cluster Resource Management)
GOVERNANCE
- Apache Falcon
- Apache Sqoop
- Apache Flume
- Apache Kafka
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
- Apache Pig
- Apache Hive
- Cascading
- Apache HBase
- Apache Accumulo
- Apache Solr
- Apache Spark
- Apache Storm
SECURITY
- Apache Ranger
- Apache Knox
- Apache Falcon
OPERATIONS
- Apache Ambari
- Apache ZooKeeper
- Apache Oozie
Hortonworks: Hadoop for the Enterprise
We Do Hadoop
Who we are
- 2005: Apache Hadoop at Yahoo!
- 2011: Inception of Hortonworks
- 24 Developers and Architects
- 900+ Employees
- 100% Renewal Rate
- 5 out of 5 Support Score*
- 32,000 Nodes at Yahoo!
- 30+ Migrations
- 300+ Customers
- 600+ Partners
IN-MEMORY
HIGH-PERFORMANCE
ANALYTICS
BUSINESS INTELLIGENCE
DATA VISUALIZATION
DATA MANAGEMENT
Why SAS?
- SAS can work with Hadoop, lifting data into a purpose-built in-memory environment for advanced analytics
- SAS can treat Hadoop just like any other data source, pulling data from Hadoop when it is most convenient
- SAS can work directly in Hadoop, leveraging Hadoop's distributed processing capabilities
SAS is the only vendor that supports all of these methods
SAS accesses and extracts data from Hadoop to a SAS server for processing, and writes results back.
- Bridge to traditional SAS environments
- Hadoop treated as just “another data source”
- Performance limited to the bandwidth of a single pipe
DATA MOVEMENT
SAS + from Hortonworks
SAS accesses and processes Hadoop data on SAS servers while keeping the data and computations massively parallel.
- Supports advanced analytics via shared computing
- Allows data storage and analytics to scale separately
- Ideal when analytical rigor, sophistication, and governance are required
DATA LIFT INTO MEMORY
SAS + with Hortonworks
SAS processes data directly in the Hadoop cluster.
SAS LOGIC
- SAS Embedded Process enables scalable SAS compute in Hadoop
- SAS compute is orchestrated via Hadoop technology (YARN)
- Data manipulation, data quality, and scoring support
- Ideal when all data is landing in Hadoop, and Hadoop is the proper place for processing
SAS + in Hadoop
About Rogers Media
–Great Brands
–Media advertising revenue a priority
–Audience Strategy the future
2013 CONSOLIDATED REVENUE BY SEGMENT (%)
AUDIENCE BUSINESS CHALLENGES
1. UNDERSTAND AUDIENCE
Having the largest volume of data sets and audience segments/profiles in Canada while leading the Canadian marketplace in privacy and governance
2. FIND AUDIENCE
Being leaders in identifying and targeting audiences across channels, platforms and devices
3. ENGAGE AUDIENCE
Driving engagement across platforms and formats
4. MEASURE AUDIENCE
Exceeding client expectations with transparent reporting and the most accurate attribution models
AUDIENCE PLATFORM – THE DATA LAKE
- Land massive clickstream log files:
  - 100+ million records / day
  - 30 million unique IDs / month
- Cost effective / competitive
- Lean methodology
- Landed data always available if requirements should change
- Data definition on read
- Adoption of the Data Lake framework
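"Data definition on read" means the raw files are landed unchanged and a schema is applied only when someone reads them, so new questions do not require reloading the data. A minimal sketch of the idea follows; the tab-separated log layout, field names, and sample rows are invented for illustration and are not Rogers Media's actual format.

```python
import csv
import io

# Made-up sample of landed clickstream text: timestamp, user ID, page, device.
RAW_LOG = (
    "2014-11-03T10:15:00\tu123\t/sports/hockey\tmobile\n"
    "2014-11-03T10:15:02\tu456\t/news\tdesktop\n"
)


def read_with_schema(raw_text, schema):
    """Apply column names and converters at read time; stored text is untouched."""
    reader = csv.reader(io.StringIO(raw_text), delimiter="\t")
    for row in reader:
        yield {name: convert(row[idx]) for name, (idx, convert) in schema.items()}


# Two hypothetical schemas over the same landed bytes: name -> (column, converter).
PAGE_VIEWS = {"user_id": (1, str), "page": (2, str)}
DEVICE_MIX = {"user_id": (1, str), "device": (3, str)}

views = list(read_with_schema(RAW_LOG, PAGE_VIEWS))
devices = list(read_with_schema(RAW_LOG, DEVICE_MIX))
```

If requirements change later, only a new schema dictionary is needed; the landed data stays as it was.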
more data & better algorithms
Summary
Hortonworks Jumpstart Package
Proposal for a simple production-ready Hadoop cluster in one week
Hadoop is a Platform Decision
Adoption follows a consistent journey
From data architecture efficiencies to new analytic apps, and ultimately to a “data lake”.
HDP: A centralized architecture built on YARN
Any application, any data, anywhere.
HDP: A completely open data platform
Platforms are ultimately defined by open communities.
HDP subscription supports entire lifecycle
World class experience to ensure success from architecture to production to expansion.
Cautionary Statement Regarding Forward-Looking Statements
This presentation contains forward-looking statements involving risks and uncertainties.
Such forward-looking statements in this presentation generally relate to future events, our ability to increase the number of support subscription customers, the growth in usage of the Hadoop framework, our ability to innovate and develop the various open source projects that will enhance the capabilities of the Hortonworks Data Platform, anticipated customer benefits and general business outlook. In some cases, you can identify forward-looking statements because they contain words such as “may,” “will,” “should,” “expects,” “plans,” “anticipates,” “could,” “intends,” “target,” “projects,” “contemplates,” “believes,” “estimates,” “predicts,” “potential” or “continue” or similar terms or expressions that concern our expectations, strategy, plans or intentions. You should not rely upon forward-looking statements as predictions of future events. We have based the forward-looking statements contained in this presentation primarily on our current expectations and projections about future events and trends that we believe may affect our business, financial condition and prospects. We cannot assure you that the results, events and circumstances reflected in the forward-looking statements will be achieved or occur, and actual results, events, or circumstances could differ materially from those described in the forward-looking statements.
The forward-looking statements made in this presentation relate only to events as of the date on which the statements are made and we undertake no obligation to update any of the information in this presentation.