(1)

Transactions & Interactions

The Correlation of Structured and Unstructured Data

Shaun Connolly, Hortonworks

(2)

Big Data Has Reached Every Market

Source: McKinsey & Company report. Big data: The next frontier for innovation, competition, and productivity. May 2011.

Digital data is personal, everywhere, increasingly accessible, and will continue to grow exponentially

(3)

What is Apache Hadoop?

Set of open source projects

Transforms commodity hardware into a service that:

– Stores petabytes of data reliably

– Allows huge distributed computations

Solution for big data

– Deals with complexities of high volume, velocity & variety of data

Key attributes:

– Redundant and reliable (no data loss)

– Extremely powerful

– Batch processing centric

– Easy to program distributed apps

– Runs on commodity hardware

One of the best examples of open source driving innovation
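
To make "batch processing centric" and "easy to program distributed apps" concrete, here is a minimal sketch of the map, shuffle, and reduce phases that a Hadoop-style batch job follows. It is plain single-process Python, not Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the framework would run many map and reduce tasks in parallel across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, standing in for the framework's shuffle step."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over an in-memory batch of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop data"]))
```

The programmer writes only the map and reduce functions; in this model, distribution, grouping, and retrying failed tasks are left to the platform.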

(4)

Yahoo!, Apache Hadoop & Hortonworks

http://www.wired.com/wiredenterprise/2011/10/how-yahoo-spawned-hadoop

Hadoop at Yahoo!

•  40K+ servers
•  170PB storage
•  5M+ monthly jobs
•  1000+ active users

2006: Yahoo! embraced Apache Hadoop, an open source platform, to crunch epic amounts of data using an army of dirt-cheap servers

Yahoo! spun off 22+ engineers into Hortonworks, a company focused on advancing open source Apache Hadoop for the broader market

(5)

Platform components (diagram): Hadoop Core, HBase, Pig, Hive, HCatalog, ZooKeeper

Challenge:

•  Integrate, manage, and support changes across a wide range of open source projects that power the Hadoop platform, each with its own release schedule, versions, and dependencies.

•  Time-intensive, complex, expensive

Solution: Hortonworks Data Platform

•  Integrated certified platform distributions

•  Extensive Q/A process

•  Industry-leading Support with clear service levels for updates and patches

•  Continuity via multi-year Support and Maintenance Policy

Hortonworks Data Platform

Fully Supported Integrated Platform

= New Version

(6)

Advancing Hadoop for Broader Market

Architecting the Future of Big Data

Hortonworks Focus

Transform Apache Hadoop into a complete Data Platform with the data, application, and operational services that enable a vibrant ecosystem driving the next wave of business innovation and productivity

Operations

Hortonworks Data Platform

Platform APIs Administration APIs

(7)

Advancing Hadoop for Broader Market

Replication, DR Retention, ILM

ETL (basic & advanced)

Integration (msg bus, …)

Datastore Federation SQL, NewSQL, NoSQL

Tools, Languages Algorithms, Data Science Search

Analytics, EDW

SaaS, Packaged & Custom Apps BI, Reporting, Visualization

Operations Hortonworks Data Platform

Platform APIs Administration APIs

Hortonworks Data Platform

(8)

Big Data Value Creation Opportunities

Financial Services

•  Detect/prevent fraud

•  Model and manage risk

•  Improve debt recovery rates

•  Personalize banking/insurance products

Healthcare

•  Remote patient monitoring

•  Predictive modeling for new drugs

•  Personalized medicine

•  Optimal treatment pathways

Retail

•  In-store behavior analysis

•  Cross selling, recommendation engines

•  Optimize pricing, placement, design

•  Optimize inventory and distribution

Web / Social / Mobile

•  Sentiment analysis

•  Web log, image, and video analysis

•  Location-based marketing

•  Price comparison services

Manufacturing

•  Design to value

•  Improve service via product sensor data

•  Crowd-sourcing

•  “Digital factory” for lean manufacturing

Government

•  Detect/prevent fraud

•  Segment populations, customize action

•  Support open data initiatives

•  Cyber-security

(9)

© 2011 Datameer, Inc. All rights reserved. Page 9

Datameer

•  Business Intelligence Platform on Hadoop

•  Established 2009 by Hadoop and enterprise software veterans

•  Offices in Silicon Valley, New York and Germany

•  Funded by Kleiner Perkins Caufield & Byers and Redpoint Ventures

(10)

Leading Financial Institution

§  Log file analytics involving disparate systems

§  Ensures smooth operation during “mini-crises”

§  Also provides visibility into visitors’ click path

§  Thousands of operational servers, 10+ formats

§  Hadoop as long-term data store: 200+ TB

§  Using Datameer for ingestion, analysis, and data quality metrics

§  Datameer unifies JSON, XML, IIS, Apache, and proprietary log formats into a company standard

§  Datameer integrated with Active Directory
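
As an illustration of what unifying disparate log formats into one standard involves, the sketch below normalizes two of the formats named above (Apache access logs and JSON) into a single record layout. It is not Datameer's implementation or API. The target fields (`timestamp`, `client`, `path`, `status`, `source`) and the JSON input keys (`ts`, `client`, `path`, `status`) are assumptions made up for this example; only the Apache Common Log Format layout is a real convention.

```python
import json
import re
from datetime import datetime, timezone
from typing import Dict, Optional

# Apache Common Log Format: host ident user [time] "method path protocol" status bytes
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+'
)


def from_apache(line: str) -> Optional[Dict[str, object]]:
    """Parse one Common Log Format line, or return None if it does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    parsed = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "timestamp": parsed.astimezone(timezone.utc).isoformat(),
        "client": match["host"],
        "path": match["path"],
        "status": int(match["status"]),
        "source": "apache",
    }


def from_json(line: str) -> Optional[Dict[str, object]]:
    """Parse one JSON log line using the hypothetical keys ts/client/path/status."""
    try:
        raw = json.loads(line)
        parsed = datetime.fromisoformat(raw["ts"])
        return {
            "timestamp": parsed.astimezone(timezone.utc).isoformat(),
            "client": raw["client"],
            "path": raw["path"],
            "status": int(raw["status"]),
            "source": "json",
        }
    except (KeyError, TypeError, ValueError):
        # Malformed JSON, missing keys, or an unparseable timestamp.
        return None


def normalize(line: str) -> Optional[Dict[str, object]]:
    """Route a raw line to the matching parser based on its first character."""
    if line.lstrip().startswith("{"):
        return from_json(line)
    return from_apache(line)
```

Every parser returns the same keys with timestamps converted to UTC, so downstream analysis and data quality checks can treat all sources alike. Lines that match neither format come back as `None`, which can be counted as a simple data quality metric.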

(11)

Major Social Gaming Pioneer

§  Top tier Facebook-based gaming company

§  Offering: strategy games to millions of monthly users

§  Looking to aggregate different data sources such as gameplay and Twitter sentiment

§  Datameer aggregates logs from hundreds of game servers for gameplay behavioral analytics, drives existing dashboards

§  Hourly/daily/weekly metrics available to game producers, sliced across 10+ dimensions, establishing high-value cohorts to optimize games and cross-sell / up-sell

§  Close monitoring of the performance of their

(12)

Seamless Data Integration

•  Wizard-based integration
•  Structured, semi- and unstructured
•  No complex mappings/schemas
•  Connector plug-in API

Powerful Analytics

•  Interactive spreadsheet UI
•  Cleansing, transformation, analysis
•  Over 180 built-in analytic functions
•  Macros and function plug-in API

Self-Service Dashboards

•  Drag and drop
•  Powerful visualizations
•  Mashup anything
•  Integration into existing portals

(13)

Enterprise Integration

(14)

Datameer Analytics Solution

Use Case Example

(15)

For more information: www.datameer.com or www.hortonworks.com

Live Hortonworks Webinars: What’s in Store for Hadoop.Next

Sign up at: www.hortonworks.com/webinars

Live Datameer Webinars: Datameer Analytics Solution “Below Decks” with Datameer

Sign up at: http://datameer.com/news-events/events.html

For information on Datasift, please go to www.datasift.com.
