Big Data Analytics Using SAP HANA Dynamic Tiering Balaji Krishna SAP Labs SESSION CODE: BI474


(1)

Big Data Analytics Using SAP HANA Dynamic Tiering

Balaji Krishna – SAP Labs

(2)

• How dynamic tiering reduces the TCO of a HANA solution
• Data aging concepts using in-memory and on-disk storage
• Single install, administration, and monitoring

(3)

IDC predictions for 2014

Data explosion

Data volumes will continue to explode to 6 billion petabytes

Social networking

Social networking will become embedded in cloud platforms and most enterprise apps and processes

Cloud

Cloud spending will surge by 25%, reaching over $100 billion. There will be a doubling of cloud data centers.

Internet of Things

30 billion devices and sensors in 2020, driving $8.9 trillion in revenue

[Figure: word cloud around "Big Data": Mobile, CRM Data, Planning, Opportunities, Transactions, Customer, Sales Order, Things, Instant Messages, Demand, Inventory.]

(4)

SAP End to End Data Management for Real Time Business

[Figure: diagram with the labels Business & Consumer Applications; Big Data; SAP Data Management (Store, Transact, Analyze, Predict); Custom Development; ISVs & OEMs; ERP; Internet of Things; Workforce of the Future; Cloud; Industries.]

(5)

SAP Data Management Portfolio
End-to-End Data Management & App Platform for Real-Time Business

[Figure: portfolio diagram with the following labels]
• SAP HANA platform: real-time transactions + end-to-end analytics
• Platform components: Database, Processing Engine, Application Function Lib. & Data Models, Integration Services, Extended Application Services
• Real-time analytics: Operational Analytics; Big Data Warehousing; Predictive, Spatial & Text Analytics
• Real-time applications: Sense & Respond; Planning & Optimization; Consumer Engagement
• Other products: SAP ESP, SAP ASE, Replication Server, SAP SQL Anywhere, SAP IQ, SAP Data Services

(6)

Time Value of Data

[Figure: value of data plotted against time. The value of immediate data access declines after the last time the data was accessed. An archive access event marks the point when you need the data again.]

• Regulatory audit
• Business critical reference data
• Source data

(7)

Warm/Cold Data Management
Questions about SAP HANA dynamic tiering

Why is warm data management important for SAP HANA?
• Size and cost constraints may prohibit an all-in-memory solution
• Not all data has the same value
• Warm data has less stringent latency requirements than hot data

Why is SAP HANA dynamic tiering the best solution for warm data management?
• SAP HANA dynamic tiering excels at ad hoc queries on structured data from terabyte to petabyte scale
• SAP HANA dynamic tiering is a deeply integrated, high-performance solution in a single system
• SAP HANA dynamic tiering uses disk-backed, smart column store technology based on SAP IQ

What about Hadoop for warm data storage and processing?
• Hadoop has unlimited capacity for raw data processing
• Hadoop is best suited for batch processing of raw, unstructured data
• Hadoop is an external data store with technical integration into HANA, with higher TCO in order to manage the additional system

(8)

Introducing SAP HANA dynamic tiering

• Manage data cost-effectively, yet with the desired performance based on SLAs
• Handle very large data sets, from terabytes to petabytes
• Update and query all data seamlessly via HANA tables
• The application defines which data is "hot" and which data is "warm"
• Native Big Data solution to handle a large percentage of enterprise data needs without Hadoop

[Figure: a HANA application connects to one HANA database, an SAP HANA system with the dynamic tiering option. Worker hosts hold the hot store (column tables and row tables); an ES host holds the warm store (extended tables). Fast data movement and optimized push-down query processing link the two stores.]

(9)

Data Qualities and Data Temperatures
How to think about it

Data in the database (SAP HANA platform): different data temperatures
• Hot data, always in memory: maximum access performance. Data for daily reporting and other high-priority data.
• Warm data, not (always) in memory: reduced access performance. Other data required to operate the application.
All part of the database's data image.

Data moved out of the database (externalized): different data qualities
• Near-line storage (NLS): available for read access. Data that is (normally) not updated and infrequently accessed.
• Traditional archive: not accessible without an IT process. Data that's kept for legal reasons or similar.
Data is stored and managed outside of the application database.
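As a rough illustration of the four classes on this slide, the sketch below sorts records into hot, warm, near-line, and archive by days since last access. The thresholds, the `still_updated` flag, and the function name are hypothetical: the deck gives no numeric cut-offs and states elsewhere that the application defines what is hot and what is warm.

```python
from enum import Enum


class Tier(Enum):
    HOT = "hot store (in memory)"
    WARM = "warm store (on disk, inside the database)"
    NLS = "near-line storage (read access, outside the database)"
    ARCHIVE = "traditional archive (needs an IT process to access)"


# Hypothetical cut-offs in days since last access; not taken from the slides.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 365
NLS_MAX_DAYS = 5 * 365


def classify(days_since_access: int, still_updated: bool) -> Tier:
    """Map a record to a tier by access age, keeping updatable data in the database."""
    if days_since_access <= HOT_MAX_DAYS:
        return Tier.HOT
    # Data that is still updated stays inside the database, so it is at most warm.
    if still_updated or days_since_access <= WARM_MAX_DAYS:
        return Tier.WARM
    if days_since_access <= NLS_MAX_DAYS:
        return Tier.NLS
    return Tier.ARCHIVE


if __name__ == "__main__":
    for days, updated in [(3, True), (200, False), (900, False), (900, True), (4000, False)]:
        print(days, updated, classify(days, updated).name)
```

The only rule taken from the slide is the split itself: hot and warm data stay in the database, while near-line and archive data are externalized and normally no longer updated.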

(10)

SAP HANA dynamic tiering
Map data priorities to data management

Hot Store: classic HANA tables
• Primary data image in memory
• DB algorithms optimized for in-memory data
• Persistence on disk to guarantee durability

Warm Store: extended tables
• Primary data image on disk
• Data processing using algorithms optimized for disk-based data
• Main memory used for caching and processing

[Figure: dynamic tiering keeps hot and warm data all in one SAP HANA database. Hot data has its primary image in memory (RAM), with disk used for durability. Warm data has its primary image on disk, with memory used as cache and for processing.]

(11)

Implementation choices

(12)

SAP HANA dynamic tiering
One database, one experience for HANA application developers and admins

• Reduced TCO
• Optimized for performance
• Single database experience
• Centralized operational control

[Figure: SAP HANA dynamic tiering with its capabilities: centralized monitoring and administration, high-speed data ingest, common installer and licensing model, unified backup and restore, integrated security, optimized query processing.]

(13)

SAP HANA dynamic tiering
The overall system layout

SAP HANA with dynamic tiering consists of two types of hosts:
• Regular worker hosts (running the classical HANA processes: indexserver, nameserver, daemon, xsserver, ...)
  • HANA hosts can be single-node or scale-out; appliance or TDI
• "ES hosts" (running nameserver, daemon, and esserver)
  • esserver is the database process of the warm store

• One single SAP HANA database: one SID, one instance number
• All client communication happens through the index server / XS server

[Figure: a client application connects to an SAP HANA system with the dynamic tiering service. Worker hosts (standby hosts not shown) hold the hot store with column and row tables; an ES host (controller) and further ES hosts hold the warm store with extended tables. Fast data movement and optimized push-down query processing link the stores.]

(14)

HANA Extended Tables

• The HANA extended table schema is part of the HANA database catalog
• HANA extended table data resides in the warm store
• A HANA extended table is a first-class database object with full ACID compliance

[Figure: in the HANA database, the database catalog holds the table definitions of both a classical HANA column/row table and an extended (warm) table. The data of the classical table sits in the hot store; the data of the extended table sits in the warm store.]

(15)

High Speed Data Ingest

[Figure: within the HANA database, CSV data is imported directly into a warm extended table (IMPORT FROM CSV FILE 'data.csv' INTO t_extended), and data moves between the hot HANA column table and the warm extended table (materialization).]

Import from CSV files:

IMPORT FROM CSV FILE 'bigfile.csv' INTO t1

Bulk array insert:

INSERT INTO t1 (col1, col2, col3...) VALUES (val1, val2, val3...)

High-speed data movement between HANA tables and HANA extended tables:

INSERT INTO t_extended SELECT c1 FROM t_hana

Concurrent inserts from multiple connections:

A HANA extended table may be a DELTA-enabled table, which allows multiple concurrent writes
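The bulk insert and the table-to-table movement above can be tried out on a laptop with a small stand-in. The sketch below uses Python's built-in SQLite module, not HANA: both tables are ordinary SQLite tables, and nothing HANA-specific (extended storage, DELTA, CSV import) is modeled. Only the table names `t_hana` and `t_extended` and the `INSERT ... SELECT` shape come from the slide.

```python
import sqlite3

# SQLite stands in for HANA here; both tables are ordinary SQLite tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_hana (c1 INTEGER)")      # plays the hot column table
conn.execute("CREATE TABLE t_extended (c1 INTEGER)")  # plays the warm extended table

# Bulk array insert: one statement, many parameter rows.
rows = [(i,) for i in range(1000)]
conn.executemany("INSERT INTO t_hana (c1) VALUES (?)", rows)

# Set-based movement from the hot table to the warm table, as on the slide.
conn.execute("INSERT INTO t_extended SELECT c1 FROM t_hana")
conn.commit()

moved = conn.execute("SELECT COUNT(*) FROM t_extended").fetchone()[0]
print(moved)
```

The point of the set-based form is that the rows never pass through the client: one statement moves them inside the database.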

(16)

Optimized Query Processing

Parallel query processing
• Data is pulled from the HANA hot store into the HANA warm store query processing engine using multiple streams, and processed in parallel

Push/pull query optimization and transformation
• Query operations ship to the hot or warm store as appropriate for native performance

Extended tables may be used in HANA Calc views
• The HANA Calc engine and HANA SQL engine share extended table query performance optimizations

[Figure: query tree with joining, grouping, and ordering operators over tables T1 to T4.]
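A toy model of the push/pull idea, assuming a catalog that records which store holds each table. The catalog contents and the `placement` function are hypothetical, and a real optimizer would weigh costs; this sketch simply sends mixed-input operations to the warm engine, following the first bullet on the slide.

```python
from typing import Dict, List

# Hypothetical catalog: which store holds each table's data.
CATALOG: Dict[str, str] = {"T1": "hot", "T2": "hot", "T3": "warm", "T4": "warm"}


def placement(tables: List[str], catalog: Dict[str, str] = CATALOG) -> str:
    """Pick the engine for one operation from the stores of its input tables."""
    stores = {catalog[t] for t in tables}
    if stores == {"hot"}:
        return "hot"   # all inputs in memory: run in the hot store
    if stores == {"warm"}:
        return "warm"  # all inputs on disk: push the operation down to the warm store
    # Mixed inputs: per the slide, hot data is streamed into the warm store engine.
    return "warm (hot inputs streamed in)"


if __name__ == "__main__":
    print(placement(["T1", "T2"]))
    print(placement(["T3", "T4"]))
    print(placement(["T1", "T3"]))
```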

(17)

Example Query Plan

Customer is a native HANA table in HANA memory.
Product is a HANA extended table in the warm store.

select "account_num", count(*) as account_count
from VXM_FOODMART.CUSTOMER C
where "lname" >= 'Ga' and "lname" < 'Gb'
  and exists ( select * from VXM_IQSTORE.PRODUCT P
               where "product_id" = "customer_id" )
group by "account_num"
order by "account_num";

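To see what the example query returns, it can be run against toy data. The sketch below uses SQLite through Python as a stand-in: the schemas VXM_FOODMART and VXM_IQSTORE are dropped, both tables are plain SQLite tables, and the rows are invented for illustration. It shows the query's semantics only, not the HANA query plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER, account_num INTEGER, lname TEXT);
CREATE TABLE product (product_id INTEGER);
-- Invented sample rows, not from the slides.
INSERT INTO customer VALUES (1, 500, 'Garcia'), (2, 500, 'Gallo'),
                            (3, 700, 'Gates'), (4, 900, 'Smith'), (9, 900, 'Gable');
INSERT INTO product VALUES (1), (2), (3), (4);
""")

# Same shape as the slide's query, minus the schema prefixes.
QUERY = """
SELECT "account_num", COUNT(*) AS account_count
FROM customer C
WHERE "lname" >= 'Ga' AND "lname" < 'Gb'
  AND EXISTS (SELECT * FROM product P WHERE "product_id" = "customer_id")
GROUP BY "account_num"
ORDER BY "account_num"
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [(500, 2), (700, 1)]
```

Smith fails the name range and Gable (customer 9) has no matching product, so only accounts 500 and 700 remain.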
(18)

HANA Monitoring and Administration

HANA Cockpit:

• New, web based monitoring and administration console for HANA Extended Storage

• HANA Studio will be used for design and modeling of HANA extended tables

• HANA Cockpit displays status, CPU/memory/storage resource utilization, table usage statistics

• Provides access to and search of server logs and custom traces

• Shows alerts triggered by extended storage

• Enables administration of extended storage: add and drop storage, or increase size of file

[Figure: HANA Cockpit screenshot showing user tables by top usage.]

(19)

• HANA backup manages backup of both hot and warm store

• Point in Time Recovery (PITR) is supported

[Figure: backup timeline for HANA and extended storage over times t1, t2, t3, showing data backups (manual or scheduled), log backups (automatic, or none), the log area, a system crash, and the backup history used for restore.]

• Data backups with log backups allow restore to a point in time or to the most recent state: t1 -> t3
• Data backups alone allow restore to a specific backup only: t1 or t2
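The two restore rules above can be written down as a small check. This is a simplified model under stated assumptions, not HANA's recovery logic: times are plain numbers, log coverage is assumed to be continuous from the chosen data backup onward, and the function and parameter names are hypothetical.

```python
from typing import List, Optional


def restorable(target: float, data_backups: List[float],
               log_covered_until: Optional[float]) -> bool:
    """Return True if the system can be restored to time `target`.

    data_backups: times at which data backups were taken.
    log_covered_until: latest time covered by log backups or the log area,
        or None when no logs are available.
    """
    if target in data_backups:
        return True  # a data backup alone restores exactly its own state
    if log_covered_until is None:
        return False  # no logs: only the backup times themselves are reachable
    # With logs: start from a data backup at or before the target and replay.
    has_base = any(b <= target for b in data_backups)
    return has_base and target <= log_covered_until


if __name__ == "__main__":
    t1, t2, t3 = 1.0, 2.0, 3.0
    print(restorable(2.5, [t1, t2], t3))    # between backups, logs available
    print(restorable(2.5, [t1, t2], None))  # between backups, no logs
```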

(20)

High Availability and Disaster Recovery

• High availability
  • Compute node failure will result in failover to a standby node (manual for warm store nodes)
  • Handling of storage failure depends on the storage vendor's inherent disk mirroring and fault tolerance capabilities
  • Hot and warm store should use the same storage to facilitate auto-failover in the future
• Disaster recovery
  • HANA without dynamic tiering supports continuous replication to maintain a disaster recovery site
  • HANA with dynamic tiering will maintain a disaster recovery site through backup and restore capabilities only
  • Disaster recovery through system replication is planned for a future release
  • Disaster recovery through storage replication may be added independently from software releases

[Figure: classical HANA services (hot store) run on a compute node with a standby node and auto-failover; the warm store service runs on a compute node with a standby node and manual failover. Storage is mirrored.]

(21)

SAP HANA Multitenant Database Containers

Each extended store is dedicated to exactly one tenant database.

[Figure: a HANA cluster of compute nodes hosting tenant databases, each extended store attached to a single tenant database; one part of the cluster is marked "No ES".]

(22)

Hardware Layout View
Recommended Option: Use Homogeneous Hardware for All Hosts

ES may be added to certified HANA storage, or may use individual storage.

[Figure: one HANA system (one SID) as a HANA scale-out on certified hardware boxes: Node 1, Node 2, a standby node, and an ES DB node. Networks: intra-node network, client network for HANA clients (DB clients, HANA Studio, ...), and a storage network for HANA and ES. Storage holds hot data and redo logs for HANA, and warm data and ES DB logs for ES. Non-certified storage for /hana/shared/ holds binaries, traces, and core dumps.]

(23)

Hardware Layout View
Alternative Option: Use Individual Hardware

[Figure: one HANA system (one SID) as a HANA scale-out: Node 1, Node 2, and a standby node on certified hardware boxes, plus an ES DB node on a non-certified hardware box. Networks: intra-node network, client network for HANA clients (DB clients, HANA Studio, ...), HANA storage network, and a separate ES storage network. Certified storage holds the data (hot data) and redo logs of HANA; non-certified storage for ES holds warm data and ES DB logs. Non-certified storage for /hana/shared/ holds binaries, traces, and core dumps.]

(24)

SAP BW and native HANA applications

(25)

© 2014 SAP SE or an SAP affiliate company. All rights reserved. Public 25

SAP NetWeaver BW powered by SAP HANA
Data Classification by Object Type

Data categories in a BW system:
• Frequent reporting and/or HANA-native operations
• Limited reporting, limited HANA-native operations
• "Old", "out-of-use" data: archive, read-only, different SLAs

[Figure: BW operational data layers (Staging Layer, Analytic Mart, Business Transformation, EDW Propagation, EDW Transformation) shown alongside Corporate Memory and Archive/NLS.]

(26)

Extended Tables in HANA BW
Use Case: Staging and Corporate Memory

Object Classification in BW
• Data Sources and write-optimized DSOs can have the property "Extended Table"
  • Generated tables are of type "Extended"
  • All BW standard operations supported, no changes
  • Only minor temporary RAM required in HANA
• InfoCubes and regular or advanced DSOs
  • Generate standard column tables

[Figure: a BW system on the SAP HANA database. The database catalog holds the table schemas. Data of the PSA table (Data Source, staging area) and of the active table (write-optimized DSO, corporate memory) resides in the warm store; data of the fact table (InfoCube, data mart) resides in the hot store.]

(27)

SAP HANA dynamic tiering for Big Data
SAP HANA with dynamic tiering provides a native Big Data solution

SAP HANA (hot data)
• Cutting-edge, in-memory platform
• Transact/analyze in real time
• Native predictive, text, and spatial algorithms

HANA extended tables (petascale, warm structured data)
• Petascale extension to HANA with disk-backed, columnar database technology
• Expand HANA capacity with warm/cool structured data in the HANA warm store
• Tight integration between HANA hot store and HANA warm store for optimal performance

(28)

HANA with Dynamic Tiering
Native Big Data solution for a multitude of use cases

SAP HANA Dynamic Tiering for Big Data Use Cases across Industries

Telecommunications: Network service data in HANA extended tables analyzed and correlated with customer loyalty data in SAP HANA, to anticipate customer churn and initiate customer retention response activities.

Financial services: Stock tick data streamed into SAP HANA for immediate price fluctuation analysis and trading actions, with historical stock price data stored in HANA extended tables for trend analysis and portfolio management.

Public utilities: Enterprise data stored in SAP HANA and large amounts of smart meter data stored in HANA extended tables, to identify operational problems and establish incentive pricing for more efficient energy use.

Airline route profitability analysis: SAP HANA analyzes revenue, variable operating costs (fuel, landing fees...), and fixed operating costs in real time to make decisions on network, pricing, and marketing to determine where to fly, when, and how often. All data must be analyzed in real time.

(29)

Future Direction

(30)

SAP HANA dynamic tiering roadmap

FUTURE
• HANA ES host auto-failover (HA)
• SAP HANA system replication for disaster recovery
• Enhanced backup and restore (BACKINT and storage snapshots)
• Hybrid extended tables with rule-based automatic data movement / aging
• Further performance optimizations for HANA Calculation Engine
• Series data support in extended tables
• Support of extended tables in Core Data Services (CDS)

PLANNED
• SAP HANA dynamic tiering available to be used by any HANA application
• Common installer
• Unified administration and monitoring using HANA Cockpit
• Extended Storage (ES) engine is part of HANA topology
• Single authentication model
• Single licensing model
• Combined error log / trace handling
• Fully integrated backup/restore

(31)

Hybrid extended tables

• Single HANA table that spans hot and warm stores
• Hot partitions in HANA memory; remaining partitions in warm store
• Automatic, rules-based, asynchronous data movement between hot and warm stores

[Figure: a hybrid extended table with hot data in the HANA tier and warm data in the warm tier. Labels: 2012, aging, regulatory audit.]
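One way to picture rule-based aging is a table partitioned by year, with a rule that keeps the most recent partitions hot. The sketch below is a toy model with a hypothetical rule and function name; the deck lists hybrid extended tables as a future item and does not specify how the rules are expressed.

```python
from typing import Dict, List


def assign_partitions(years: List[int], current_year: int, hot_years: int) -> Dict[int, str]:
    """Place yearly partitions: the most recent `hot_years` stay in the hot store."""
    cutoff = current_year - hot_years  # partitions with year <= cutoff are aged out
    return {y: ("hot" if y > cutoff else "warm") for y in sorted(years)}


layout = assign_partitions([2010, 2011, 2012, 2013, 2014], current_year=2014, hot_years=2)
print(layout)
```

Re-running the rule with a later `current_year` moves the oldest hot partition to the warm store, which is the kind of movement the slide describes as automatic and asynchronous.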

(32)

THANK YOU FOR PARTICIPATING

Please provide feedback on this session by completing

a short survey via the event mobile application.

SESSION CODE: BI474

For ongoing education on this area of focus,

visit www.ASUG.com
