Comprehensive Analytics on the Hortonworks Data Platform


Academic year: 2021


(1)

Comprehensive Analytics on the Hortonworks Data Platform

We do Hadoop.

(2)
(3)

Back to 2005…

(4)

Vertical Scaling

[Diagram: one server with RAM, CPU, and Storage]

(5)

Vertical Scaling

[Diagram: one server with RAM, CPU, and Storage]

(6)

Vertical Scaling

[Diagram: one server with RAM, CPU, and Storage]

(7)

Horizontal Scaling

[Diagram: one node with RAM, CPU, and Storage]

(8)

Horizontal Scaling

[Diagram: several nodes, each with RAM, CPU, and Storage]

(9)

Horizontal Scaling

[Diagram: many nodes, each with RAM, CPU, and Storage]

(10)

Self Healing System

[Diagram: many nodes, each with RAM, CPU, and Storage]
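One common reading of "self healing" in a horizontally scaled cluster is that every data block is stored on several nodes, and replicas lost with a failed node are re-created on the surviving ones. The sketch below is a toy model of that idea in plain Python, not Hadoop code: the function names `place_blocks` and `heal`, the node names, and the replication factor of 3 are all assumptions made for illustration.

```python
import itertools

REPLICATION = 3  # assumed replication factor for this toy model


def place_blocks(blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    placement = {}
    ring = itertools.cycle(nodes)
    for block in blocks:
        placement[block] = {next(ring) for _ in range(replication)}
    return placement


def heal(placement, failed, live_nodes, replication=REPLICATION):
    """Drop replicas held by the failed node and re-create them elsewhere."""
    for block, holders in placement.items():
        holders.discard(failed)
        # Prefer the least-loaded live nodes that do not hold this block yet.
        candidates = sorted(
            (n for n in live_nodes if n not in holders),
            key=lambda n: sum(n in h for h in placement.values()),
        )
        while len(holders) < replication and candidates:
            holders.add(candidates.pop(0))
    return placement


nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_blocks(["blk_a", "blk_b", "blk_c", "blk_d"], nodes)

# Simulate the loss of node2 and let the cluster restore full replication.
survivors = [n for n in nodes if n != "node2"]
heal(placement, "node2", survivors)
```

After `heal` runs, every block is back at three replicas and none of them sits on the failed node, which is the property the slide title points at.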

(11)

Hadoop 1.0

MapReduce

HDFS (Hadoop Distributed File System)

[Diagram: cluster nodes 1 to N]
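To make the MapReduce layer concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps as a word count. It is plain Python rather than the Hadoop API; the function names and the sample input are assumptions, and each inner list merely stands in for a file block stored on a different node.

```python
from collections import defaultdict


def map_phase(split):
    """Emit (word, 1) for every word in one input split."""
    for line in split:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group all values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Each inner list stands in for one block of a file on a different node.
splits = [
    ["we do hadoop", "hadoop stores data"],
    ["hadoop processes data"],
]
mapped = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle(mapped))
```

In a real cluster the map calls would run in parallel on the nodes that hold each block, and only the grouped intermediate pairs would cross the network.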

(12)
(13)
(14)

Hadoop 2.0

SOURCES: Existing Systems (ERP, CRM, SCM), Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured

ANALYTICS: Data Marts, Applications, Business Analytics, Visualization & Dashboards

YARN: Data Operating System

Interactive, Real-Time, Batch, Partner ISV

HDFS (Hadoop Distributed File System)

MPP

EDW

[Diagram: cluster nodes]

(15)

Hortonworks Data Platform 2.2

YARN: Data Operating System (Cluster Resource Management)

HDFS (Hadoop Distributed File System)

GOVERNANCE: Apache Falcon, Apache Sqoop, Apache Flume, Apache Kafka

BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm

SECURITY: Apache Ranger, Apache Knox, Apache Falcon

OPERATIONS: Apache Ambari, Apache ZooKeeper, Apache Oozie

[Diagram: cluster nodes]

(16)

Hortonworks: Hadoop for the Enterprise

We Do Hadoop

(17)

Who we are

2005: Apache Hadoop at Yahoo!

2011: Inception of Hortonworks

24: Developers and Architects

900+: Employees

100%: Renewal Rate

5 out of 5: Support Score*

32,000: Number of Nodes at Yahoo!

30+ Migrations

300+ Customers

600+ Partners

(18)

Why SAS?

IN-MEMORY

HIGH-PERFORMANCE

ANALYTICS

BUSINESS INTELLIGENCE

DATA VISUALIZATION

DATA MANAGEMENT

(19)

SAS can work with Hadoop, lifting data into a purpose-built, in-memory advanced analytics environment

SAS can treat Hadoop just as any other data source, pulling data from Hadoop when it is most convenient

SAS can work directly in Hadoop, leveraging the distributed processing capabilities of Hadoop

SAS is the only vendor that supports all of these methods

(20)

SAS accesses and extracts data from Hadoop to a SAS server for processing, and writes results back.

Bridge to traditional SAS environments

Hadoop treated as just “another data source”

Performance limited to single pipe bandwidth

DATA MOVEMENT

SAS + from Hortonworks
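The single-pipe limit can be made concrete with a back-of-the-envelope estimate. The figures below (1 TB of data, one 1 Gbit/s link) are assumptions chosen for illustration and do not come from the presentation; protocol overhead is ignored.

```python
def transfer_seconds(data_bytes, link_bits_per_second):
    """Time to push data through a single network link, ignoring overhead."""
    return data_bytes * 8 / link_bits_per_second


one_terabyte = 10**12      # assumed data volume, in bytes
one_gigabit_link = 10**9   # assumed link speed, in bits per second

hours = transfer_seconds(one_terabyte, one_gigabit_link) / 3600
print(f"{hours:.1f} hours")  # prints "2.2 hours"
```

Under these assumptions the extract alone takes over two hours before any analysis starts, which is why this mode suits modest data volumes or existing SAS environments.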

(21)

SAS accesses and processes Hadoop data on SAS Servers while keeping the data and computations massively parallel.

Supports advanced analytics via shared computing

Allows the scaling of data storage and analytics separately

Ideal when analytical rigor, sophistication and governance are required

DATA LIFT INTO MEMORY

SAS + with Hortonworks

(22)

SAS processes data directly in the Hadoop cluster.

SAS LOGIC

SAS Embedded Process enables scalable SAS compute in Hadoop

SAS compute is orchestrated via Hadoop technology (YARN)

Data manipulation, data quality, and scoring support

Ideal when all data is landing in Hadoop, and Hadoop is the proper place for processing

SAS + in Hadoop
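The general idea behind processing in the cluster is to run the computation next to each partition of data and move only small partial results. The sketch below illustrates that pattern with a global mean in plain Python. It is not the SAS Embedded Process or any SAS or YARN API; the function names and the sample values are assumptions.

```python
# Rows held on three different nodes (made-up values).
partitions = [
    [12.0, 7.5, 3.0],
    [4.5, 9.0],
    [10.0],
]


def local_summary(rows):
    """Runs next to the data: reduce a partition to a (sum, count) pair."""
    return sum(rows), len(rows)


def combine(summaries):
    """Runs on the coordinator: merge per-node pairs into a global mean."""
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count


summaries = [local_summary(rows) for rows in partitions]
mean = combine(summaries)
```

Only one `(sum, count)` pair per node crosses the network, however many rows each node holds, in contrast to the data-movement mode where every row is extracted first.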

(23)
(24)

About Rogers Media

–Great Brands

–Media advertising revenue a priority

–Audience Strategy the future

2013 CONSOLIDATED REVENUE BY SEGMENT (%)

(25)

AUDIENCE BUSINESS CHALLENGES

1. UNDERSTAND AUDIENCE

Having the largest volume of data sets and audience segments/profiles in Canada while leading the Canadian marketplace in privacy and governance

2. FIND AUDIENCE

Being leaders in identifying and targeting audiences across channels, platforms and devices

3. ENGAGE AUDIENCE

Driving engagement across platforms and formats

4. MEASURE AUDIENCE

Exceeding client expectations with transparent reporting and the most accurate attribution models

(26)

AUDIENCE PLATFORM – THE DATA LAKE

- Land massive clickstream log files:
  - 100+ million records / day
  - 30 million unique IDs / month
- Cost effective / competitive
- Lean methodology
- Landed data always available if requirements should change
- Data definition on read
- Adoption of the Data Lake framework
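"Data definition on read" means the raw files are landed untouched and a schema is applied only when a question is asked, so a new requirement can be answered from data that is already there. The sketch below shows that pattern in plain Python; the JSON record layout, the field names, and the helper `read_with_schema` are assumptions for illustration, not the Rogers Media format.

```python
import json

# Raw clickstream lines are landed untouched; these records are made up.
landed = [
    '{"id": "u1", "url": "/news", "ts": "2014-05-01T10:00:00", "device": "mobile"}',
    '{"id": "u2", "url": "/sports", "ts": "2014-05-01T10:00:05"}',
    '{"id": "u1", "url": "/sports", "ts": "2014-05-01T10:01:00", "device": "mobile"}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only `fields`, missing ones are None."""
    for line in lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


# First requirement: page views per URL.
page_views = {}
for row in read_with_schema(landed, ["url"]):
    page_views[row["url"]] = page_views.get(row["url"], 0) + 1

# Later requirement: unique IDs per device, answered from the same landed data.
ids_by_device = {}
for row in read_with_schema(landed, ["id", "device"]):
    ids_by_device.setdefault(row["device"], set()).add(row["id"])
```

Neither query required reloading or reshaping the landed files; only the list of fields passed at read time changed.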

(27)

Summary

More data & better algorithms

(28)

Hortonworks Jumpstart Package

Proposal for a simple production-ready Hadoop cluster in one week

(29)

Hadoop is a Platform Decision

Adoption follows a consistent journey

From data architecture efficiencies to new analytic apps, and ultimately to a “data lake”.

HDP: A centralized architecture built on YARN

Any application, any data, anywhere.

HDP: A completely open data platform

Platforms are ultimately defined by open communities.

HDP subscription supports entire lifecycle

World class experience to ensure success from architecture to production to expansion.

(30)

Cautionary Statement Regarding Forward-Looking Statements

This presentation contains forward-looking statements involving risks and uncertainties.

Such forward-looking statements in this presentation generally relate to future events, our ability to increase the number of support subscription customers, the growth in usage of the Hadoop framework, our ability to innovate and develop the various open source projects that will enhance the capabilities of the Hortonworks Data Platform, anticipated customer benefits and general business outlook. In some cases, you can identify forward-looking statements because they contain words such as “may,” “will,” “should,” “expects,” “plans,” “anticipates,” “could,” “intends,” “target,” “projects,” “contemplates,” “believes,” “estimates,” “predicts,” “potential” or “continue” or similar terms or expressions that concern our expectations, strategy, plans or intentions. You should not rely upon forward-looking statements as predictions of future events. We have based the forward-looking statements contained in this presentation primarily on our current expectations and projections about future events and trends that we believe may affect our business, financial condition and prospects. We cannot assure you that the results, events and circumstances reflected in the forward-looking statements will be achieved or occur, and actual results, events, or circumstances could differ materially from those described in the forward-looking statements.

The forward-looking statements made in this prospectus relate only to events as of the date on which the statements are made and we undertake no obligation to update any of the information in this presentation.
