Artur Borycki, Director International Solutions Marketing
18 March 2014
Teradata’s “Big Data” Technology Strategy & Roadmap
> Introduction and level-set
> Enabling the “Logical Data Warehouse”
> Any Data
> Any Analytic
> Virtual Compute
> Summary & conclusions
WHAT IS BIG DATA?
BIG DATA IS NOT A TECHNOLOGY
BIG DATA IS NOT THE
BIG DATA IS NOT A
BIG DATA IS NOT AN
BIG DATA IS A MOVEMENT
DEMANDING MORE
CREATE A
SUCCESS
OPERATIONAL
CULTURAL
STRATEGIC
Data-Driven Business
• View
• Develop
• Focus
• Accelerate
• Integrate
• Measure
• Empower
• Build
• Take
• Value
• Foster
• Leverage
Big Data Adoption in 2013 Shows Substance Behind the Hype
Percentage of Respondents (N = 465; multiple responses allowed)

Business issue                     Now addressing   Likely to address (12-14 months)
Enhanced customer experience       55               9
Process efficiency                 49               15
New products/New business model    42               17
More targeted marketing            41               12
Cost reduction                     37               13
Improved risk management           32               9
Monetize information directly      23               9
Regulatory compliance              17               10
Enhanced security capabilities     16               13
Others                             5                3
DATA
INSIGHT
ACTION
• Why
  - Companies that exploit ALL their data achieve competitive advantage
• How
  - Implement an enterprise data architecture that includes three components: staging, discovery, and DW
Unified Data Architecture
USERS: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Business Analysts
ANALYTIC TOOLS & APPS: Math and Stats, Data Mining, Business Intelligence, Applications, Languages, Marketing
ACCESS, MANAGE, MOVE
DATA PLATFORM, DISCOVERY PLATFORM, DATA WAREHOUSE
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
The four forces are leading to the rise of the “Logical Data Warehouse”…
Teradata’s technology strategy: enable the Logical Data Warehouse, a.k.a. “Unified Data Architecture”
Any Data
• Structured, schemaless or name-value pair
Any Analytic
• Path, graph, affinity, time-series, text, etc.
Virtual Compute
• Transparent Orchestration of Analytic Services throughout the Unified Data Architecture
Seamless data synchronisation
• “1-click” data movement and management throughout the Unified Data Architecture
Simplified Systems
• “Single pane of glass” admin; multiple moving parts that look like one system (and manage themselves)
“The Internet of Things” and the evolution of Information Management
Schema on load
Key-Value Pair
Schema on read
Increased ceremony (integrity, query performance)
Teradata’s Integrated Big Data Appliance is optimised for set-based Analytics on structured data…
Contextual Analytics
Resource Flexibility
Always On
Corporate memory
• Deep analytics
• Data Labs
• Data refinery
• Hadoop integration
• Ad hoc projects
• Peak workload assist
• Disaster recovery
• High availability
• Archive reporting & retrieval
• Audit and compliance
…can support management and Analytics of name-value pair data today…
Early binding
Late binding
Runtime
Load time
Data Warehouse
Source data
Schema
ETL
CLOB
Weblogs
SQL + parse/extract functions
BI tools
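A minimal sketch of the late-binding idea, written in Python rather than Teradata SQL. Raw weblog lines are kept exactly as loaded (a stand-in for a CLOB column), and a parse/extract function pulls out name-value pairs only when a query runs. The log format, the field names, and the `extract` helper are assumptions made for illustration; they are not Teradata functions.

```python
from urllib.parse import parse_qs, urlsplit

# Raw weblog lines kept as loaded (stand-in for a CLOB column). Made-up data.
RAW_WEBLOGS = [
    "2013-06-17 20:07:27 /search?q=shoes&color=blue&size=small",
    "2013-06-17 20:09:02 /search?q=hats&color=red",
    "2013-06-18 08:15:40 /cart?item=96",
]

def extract(line, name):
    """Late binding: parse the name-value pairs of one raw line on demand."""
    url = line.split(" ", 2)[2]           # drop the date and time fields
    pairs = parse_qs(urlsplit(url).query)
    values = pairs.get(name)
    return values[0] if values else None

def query_colors(lines):
    """Rough analogue of selecting extract(line, 'color') where it is not null."""
    return [c for c in (extract(l, "color") for l in lines) if c is not None]

print(query_colors(RAW_WEBLOGS))
```

No schema is imposed at load time, so a new pair such as `size` can be queried later without reloading; the cost is that every query pays for the parsing.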
…with native JSON support coming in Teradata 15.0

SELECT
    box.MFG_Line.Product.Color       AS "Color",
    box.MFG_Line.Product.Size        AS "Size",
    box.MFG_Line.Product.Prod_ID     AS "Prod_ID",
    box.MFG_Line.Product.Create_Time AS "Create_Time"
FROM mfgTable
WHERE CAST(box.MFG_Line.Product.Create_Time AS TIMESTAMP)
          >= TIMESTAMP'2013-06-16 00:00:00'
  AND box.MFG_Line.Product.Prod_ID = 96;

Color  Size   Prod_ID  Create_Time
-----  -----  -------  -------------------
Blue   Small  96       2013-06-17 20:07:27
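The dot-path query can be mimicked in plain Python to show what it does: walk a nested JSON document by path, project four fields, and filter on two of them. This is a sketch under stated assumptions, not Teradata's implementation; the sample documents and the `dot_path` and `select_products` helpers are hypothetical.

```python
import json
from datetime import datetime

# One JSON document per row, as the "box" column might hold it (made-up data).
MFG_TABLE = [
    '{"MFG_Line": {"Product": {"Color": "Blue", "Size": "Small", '
    '"Prod_ID": 96, "Create_Time": "2013-06-17 20:07:27"}}}',
    '{"MFG_Line": {"Product": {"Color": "Red", "Size": "Large", '
    '"Prod_ID": 97, "Create_Time": "2013-06-15 09:00:00"}}}',
]

def dot_path(doc, path):
    """Follow a dotted path such as 'MFG_Line.Product.Color' through nested dicts."""
    node = doc
    for key in path.split("."):
        node = node[key]
    return node

def select_products(rows, since, prod_id):
    """Project four fields and apply the two WHERE predicates of the query."""
    out = []
    for raw in rows:
        box = json.loads(raw)
        created = datetime.strptime(
            dot_path(box, "MFG_Line.Product.Create_Time"), "%Y-%m-%d %H:%M:%S")
        if created >= since and dot_path(box, "MFG_Line.Product.Prod_ID") == prod_id:
            out.append({name: dot_path(box, "MFG_Line.Product." + name)
                        for name in ("Color", "Size", "Prod_ID", "Create_Time")})
    return out

result = select_products(MFG_TABLE, datetime(2013, 6, 16), 96)
print(result)
```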
Need to manage and process large volumes of file-based data? We have you covered…
Optimized hardware for Hadoop
BYNET™ V5 40GB/s InfiniBand interconnect
Teradata Vital Infrastructure
Teradata Distribution for Hadoop (Based on Hortonworks HDP)
NameNode Failover
Intelligent Start and Stop
Teradata Connector for Hadoop (TDCH)
Aster and Teradata SQL-H
Teradata Studio with Smart Loader
Teradata Viewpoint
One solution, Many uses
Contextual Analytics
Corporate memory
Resource Flexibility
Always On
Always On
Raw data
Archival data
Current data
IDW data years 1-5
IDW data years 5-10
Unrefined Multi-structured data
Unrefined structured data
Need to move subsets of that data into the Exploration & Discovery environment, without transformation?
SQL has been described as “Intergalactic Data Speak”. It is the lingua franca of relational database technology.
But relational theory assumes that ordering doesn’t matter, and support for iteration and “relationship” Analytics is correspondingly weak in SQL.
What if we could elegantly extend SQL to include iterative styles of Analytics?
SELECT *
FROM nPath
(
    ON (…)
    PARTITION BY sba_id
    ORDER BY datestamp
    MODE (NONOVERLAPPING)
    PATTERN ('(OTHER_EVENT|FEE_EVENT)+')
    SYMBOLS (
        event LIKE '%REVERSE FEE%' AS FEE_EVENT,
        event NOT LIKE '%REVERSE FEE%' AS OTHER_EVENT)
    RESULT (…)
) n;
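To make the clauses of the query concrete, here is a rough Python analogue of the same idea: partition rows, order them, map each row to a symbol, and match a pattern over the resulting symbol sequence. It is a sketch, not how Aster executes nPath. The sample rows, the single-letter symbol encoding, and the `npath` helper are assumptions for illustration.

```python
import re
from collections import defaultdict

# Hypothetical rows: (sba_id, datestamp, event).
ROWS = [
    (1, "2014-01-03", "REVERSE FEE - overdraft"),
    (1, "2014-01-01", "LOGIN"),
    (1, "2014-01-02", "OVERDRAFT FEE CHARGED"),
    (2, "2014-01-05", "LOGIN"),
]

def symbol(event):
    """SYMBOLS clause: F stands for FEE_EVENT, O for OTHER_EVENT."""
    return "F" if "REVERSE FEE" in event else "O"

def npath(rows, pattern="(O|F)+"):
    """Return, per partition, the event paths matching the symbol pattern."""
    partitions = defaultdict(list)
    for sba_id, datestamp, event in rows:          # PARTITION BY sba_id
        partitions[sba_id].append((datestamp, event))
    matches = {}
    for sba_id, events in partitions.items():
        events.sort()                              # ORDER BY datestamp
        encoded = "".join(symbol(e) for _, e in events)
        # re.finditer yields non-overlapping matches, like MODE (NONOVERLAPPING)
        matches[sba_id] = [
            [e for _, e in events[m.start():m.end()]]
            for m in re.finditer(pattern, encoded)
        ]
    return matches

paths = npath(ROWS)
print(paths)
```

Passing a stricter pattern, for example `npath(ROWS, "O+F")`, returns only paths in which one or more other events lead up to a fee reversal. Ordering matters here, which is the point the slide makes about plain relational SQL.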
Teradata-Aster: runs MapReduce, Speaks SQL
Graph Basics
• Graphs model relationships between objects like people, products, processes, bank accounts
• Graphs are made up of “vertices” or “nodes” (entities) and lines called “edges” (relationships) that connect them
Two Major Categories of Graph Technologies
• Navigational: Graph databases (Neo4J), RDF/SPARQL (IBM, Oracle)
• Analytical: Graph engines (Aster, Google, Hadoop Giraph)
Aster SQL-GR™ Engine
Built on a scalable BSP (Bulk Synchronous Parallel) framework to enable Big Graph
Features
• Native graph processing
• Massively scalable, not bound by memory limits
• Pre-built graph functions
• Integrated with SQL
• Designed for Analytics
Benefits
• Richer insights with powerful Graph processing
• Large scale graph processing with best price performance
• Brings Graph processing to SQL audience
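The BSP model behind analytical graph engines can be sketched in a few lines of Python. Each vertex computes on the messages it received in the previous superstep, sends new messages, and then all vertices wait at a barrier. The example below finds connected components by propagating the smallest vertex id. It is a single-process illustration with a made-up graph, not the SQL-GR engine or its API.

```python
# Hypothetical undirected graph: vertex -> neighbours.
GRAPH = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
}

def connected_components(graph):
    """Label every vertex with the smallest vertex id reachable from it."""
    label = {v: v for v in graph}
    # Superstep 0: every vertex sends its own label to its neighbours.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(label[v])
    supersteps = 0
    while any(inbox.values()):
        supersteps += 1
        outbox = {v: [] for v in graph}
        # Compute phase: a vertex sees only messages from the previous superstep.
        for v, messages in inbox.items():
            if messages and min(messages) < label[v]:
                label[v] = min(messages)
                for n in graph[v]:            # communicate phase
                    outbox[n].append(label[v])
        inbox = outbox                        # barrier: swap message buffers
    return label, supersteps

labels, steps = connected_components(GRAPH)
print(labels)
```

Because a vertex touches only its own state and its inbox, the compute phase can be spread across many workers, which is what makes the model attractive for large graphs.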
Teradata-Aster’s SNAP™ Framework will soon enable more Analytic engines, more native data stores
SNAP™ FRAMEWORK
INTEGRATED OPTIMIZER, INTEGRATED EXECUTOR, UNIFIED SQL INTERFACE, STORAGE SYSTEM AND SERVICES
Analytic engines: SQL, MAP REDUCE, GRAPH, STATS, TEXT
Data stores: ROW STORE, COLUMN STORE, FILE STORE
HADOOP
TERADATA ASTER
DATABASE
ASTER
GRID
TERADATA
DATABASE
TERADATA
DATABASE
Remote, push-down processing in Hadoop
• Bi-directional data movement
• Leverage Hive query language (push foreign grammar)
• Results returned to Teradata for additional processing
Teradata to Teradata
• SQL sub-query sent to Teradata Database appliance
• Additional processing using data from appliance in Teradata IDW
Leverage SQL-MR functions in Aster
• Pass SQL-MR syntax/grammar to Aster
• Push local TD table for remote processing
• SQL-MR (e.g. nPath, Sessionize) functions executed in Aster
Leverage GRID compute (SAS, Perl, Python, Ruby, R)
• Data streamed from TD to GRID nodes for processing
• Isolates compute resource use and potential faults from database
Virtual Compute Capability
Remote Processing On Hadoop
• Query through Teradata
• Sent to Hadoop through Hive
• MapReduce processing on Hadoop
• Results returned to Teradata
• Additional processing joins data in Teradata
• Final results sent back to application/user
• Available in Teradata 15.0!
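The flow in the bullets above can be illustrated with a toy Python model: the heavy aggregation runs where the raw data lives, only the small result travels back, and the join happens locally. Both "systems" are plain in-memory structures here; the data, `remote_hadoop_query`, and `federated_query` are hypothetical and do not represent Teradata or Hive interfaces.

```python
# Stand-ins for the two systems; all data here is made up.
HADOOP_CLICKS = [  # raw click records held in Hadoop
    {"customer_id": 1, "page": "/home"},
    {"customer_id": 1, "page": "/cart"},
    {"customer_id": 2, "page": "/home"},
]
TERADATA_CUSTOMERS = {1: "Alice", 2: "Bob", 3: "Carol"}  # warehouse table

def remote_hadoop_query(records):
    """Runs 'remotely': aggregate clicks per customer (the pushed-down part)."""
    counts = {}
    for r in records:
        counts[r["customer_id"]] = counts.get(r["customer_id"], 0) + 1
    return counts  # only the small aggregate travels back

def federated_query():
    """Runs 'in Teradata': join the returned aggregate with local data."""
    clicks = remote_hadoop_query(HADOOP_CLICKS)
    return [(name, clicks[cid])
            for cid, name in sorted(TERADATA_CUSTOMERS.items())
            if cid in clicks]

print(federated_query())
```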
Execute SQL-MR Functions In Aster
• Query through Teradata
• SQL-MR request sent to Aster
• Sessionize function performed in Aster
• Results returned to Teradata
• Additional processing using session results in Teradata
• Final results sent back to application/user
• Available in a future release
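For readers unfamiliar with sessionization, the following Python sketch shows the general technique: order each user's clicks by time and start a new session whenever the gap to the previous click exceeds a timeout. The 30-minute timeout, the tuple layout, and the sample clicks are assumptions; this is not Aster's Sessionize implementation.

```python
from datetime import datetime, timedelta

# Hypothetical click stream: (user, timestamp).
CLICKS = [
    ("u1", datetime(2014, 3, 18, 9, 0)),
    ("u2", datetime(2014, 3, 18, 9, 5)),
    ("u1", datetime(2014, 3, 18, 9, 10)),
    ("u1", datetime(2014, 3, 18, 10, 0)),
]

def sessionize(clicks, timeout=timedelta(minutes=30)):
    """Assign a session number to each (user, timestamp) click.

    A new session starts when the gap since the same user's previous
    click exceeds `timeout`. Returns (user, timestamp, session_id) tuples.
    """
    out = []
    last_seen = {}   # user -> timestamp of that user's previous click
    session = {}     # user -> current session number
    for user, ts in sorted(clicks):          # group by user, order by time
        if user not in last_seen:
            session[user] = 0
        elif ts - last_seen[user] > timeout:
            session[user] += 1
        last_seen[user] = ts
        out.append((user, ts, session[user]))
    return out

sessions = sessionize(CLICKS)
for row in sessions:
    print(row)
```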
Teradata’s technology strategy: enable the Logical Data Warehouse, a.k.a. “Unified Data Architecture”
Any Data
• Name-value pair operators (available now)
• JSON (Teradata 15.0)
• Aster File System (Aster 6.0)
Any Analytic
• BSP-based Graph Engine (Aster 6.0)
• More Analytic engines coming to the Aster SNAP framework soon
Virtual Compute
• Fabric-Based Computing (available now – with further enhancements & extensions planned)
• Transparent Orchestration (starting in Teradata 15.0)
Seamless data synchronisation
• Unity Data Mover & Unity Ecosystem Manager (available now for multi-Teradata system environments; support for Aster, Hadoop coming soon)
Simplified Systems Management &
• Viewpoint provides “Single pane of glass” management and administration (available now – with further enhancements & extensions planned)
The UDA provides cost-effective storage for “any data”…
Why UDA Architecture Framework is important
Hadoop
JSON Store
NoSQL Store
Teradata Universe 2014