Extreme Data Warehouse Performance with Oracle Exadata

(1)


Kasey Parker

Managed Services • Cloud Services • Consulting Services • Licensing

(2)


Who is Centroid?

• Centroid is a leading provider of Oracle Technology, Applications and Infrastructure/Hosting solutions
• Established in 1997
• Office locations: Troy, MI (HQ); San Francisco, CA; Los Angeles, CA; Dallas, TX
• 200+ Consultants
• Oracle Platinum Partner
  – Selected to Oracle’s Top 25 Strategic Partner Program
  – Top 5 Oracle Partner for Hardware/Storage
• 100% Oracle “Red Stack” Focused
• “Clients for life” approach to customer relationships
• Oracle Exadata Center of Excellence established in 2011
  – Centroid Authored - Oracle Exadata Recipes (Published Feb-2013)

(3)

Agenda

• Exadata Overview
• Why Exadata?
• Exadata’s Secret Sauce
• Getting the Most out of Exadata DW
• Avoiding the 3X Club

(4)

(5)

Exadata Architecture

Database hardware and software platform “in a box”

Scale-Out Database Servers
• 8x 2-socket, or 2x 8-socket Xeon database servers
• Oracle Database, ASM, RAC; Linux or Solaris
• Standard Ethernet to data center

Scale-Out Intelligent Storage Servers
• 2-socket storage servers, Exadata Storage Software
• Up to 672 terabytes disk per rack
• 56 PCI Flash memory cards per rack

InfiniBand Network

(6)

Exadata Configuration Options

Start small and grow as needed – upgraded onsite

Half Rack

Full Rack

Quarter Rack

(7)

Exadata Hardware Summary

                               X4-2 Full         X4-2 Half         X4-2 Quarter     X4-2 Eighth
Database Servers               8                 4                 2                2
Database Grid Cores            192               96                48               24
Database Grid Memory (GB)      2048 (max 4096)   1024 (max 2048)   512 (max 1024)   512 (max 1024)
InfiniBand Switches            2                 2                 2                2
Ethernet Switch                1                 1                 1                1
Exadata Storage Servers        14                7                 3                3
Storage Grid CPU Cores         168               84                36               18
Raw Flash Capacity             44.8 TB           22.4 TB           9.6 TB           4.8 TB
Raw Storage Capacity
  High Perf                    200 TB            100 TB            43.2 TB          21.6 TB
  High Cap                     672 TB            336 TB            144 TB           72 TB
Usable Mirrored Capacity
  High Perf                    90 TB             45 TB             19 TB            9 TB
  High Cap                     300 TB            150 TB            63 TB            30 TB

Usable Triple mirrored capacity

(8)

Exadata Hardware

Exadata X4-2 SQL IO Performance

                                   X4-2 Full Rack   X4-2 Half Rack   X4-2 Quarter   X4-2 Eighth
Flash Cache SQL Bandwidth (1,3)
  High Cap Disk                    100 GB/s         50 GB/s          21.5 GB/s      10.7 GB/s
  High Perf Disk                   100 GB/s         50 GB/s          21.5 GB/s      10.7 GB/s
Flash SQL IOPS (2,3)
  8K Reads                         2,660,000        1,330,000        570,000        285,000
  8K Writes                        1,960,000        980,000          420,000        210,000
Disk SQL Bandwidth (1,3)
  High Cap Disk                    20 GB/s          10 GB/s          4.5 GB/s       2.25 GB/s
  High Perf Disk                   24 GB/s          12 GB/s          5.2 GB/s       2.6 GB/s
Disk SQL IOPS
  High Cap Disk                    32,000           16,000           7,000          3,500
  High Perf Disk                   50,000           25,000           10,800         5,400

1 - Bandwidth is peak physical scan bandwidth achieved running SQL, assuming no compression. Effective data bandwidth will be much higher when compression is factored in.

2 - IOPS are based on read IO requests of size 8K running SQL, typically with sub-millisecond latencies. Note that the IO size greatly affects flash IOPS. Others quote IOPS based on 2K, 4K or smaller IOs that are not relevant for databases, and measure IOs using low level tools instead of SQL.

3 - Actual performance varies by application.

4 - Load rates are typically limited by database server CPU, not IO. Rates vary based on load method, indexes, data types, compression, and partitioning.

(9)

(10)

Why Exadata?

Exadata is designed to eliminate the most common bottleneck for large databases…

Timely transfer of large data sets from storage subsystem to database server

(11)

Why Exadata?

Solving the IO Bottleneck

Solution 1: Enlarge the pipe

(12)

Why Exadata?

Can’t we do that with other high performance storage solutions?

YES…

There is nothing magical about Exadata hardware, and it’s still the same Oracle Database

(13)

Why Exadata?

Solving the IO Bottleneck

Solution 2: Reduce the IO operations

• Done using Exadata’s Secret Sauce: Smart Storage, Smart Flash Cache and Hybrid Columnar Compression

(14)

Exadata Innovations

• Some are automatic, with limited configuration ability
  – Storage Indexes
  – Smart Flash Cache
• Some may require some effort
  – Smart Scans
  – Hybrid Columnar Compression (HCC)

(15)

Storage Indexes

• Exadata Storage Indexes maintain summary information about table data in memory
• Store MIN and MAX values of columns
• Typically one index entry for every MB of disk
• Eliminates disk I/Os if MIN and MAX can never match the “where” clause of a query
• Completely automatic and transparent

Figure: a table with columns A, B, C and D stored in two regions, with one storage index entry per region. First region: Min B = 1, Max B = 5. Second region: Min B = 3, Max B = 8.

Select * from Table where B<2 - Only first set of rows can match
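The pruning idea on this slide can be sketched in a few lines of Python. This is only a conceptual model, not how the storage cells implement it: the `Region` class, the helper names, and the individual column B values are assumptions chosen to reproduce the Min/Max figures shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One storage region plus the min/max summary kept for column B."""
    rows: list
    min_b: int
    max_b: int


def build_region(values):
    # The summary is derived from the data, as a storage index entry would be.
    return Region(rows=list(values), min_b=min(values), max_b=max(values))


def scan_less_than(regions, bound):
    """Evaluate WHERE B < bound; return (matching values, regions actually read)."""
    matches, regions_read = [], 0
    for region in regions:
        if region.min_b >= bound:
            continue  # no row in this region can satisfy B < bound, so skip the I/O
        regions_read += 1
        matches.extend(v for v in region.rows if v < bound)
    return matches, regions_read


# Hypothetical B values giving Min/Max of 1/5 and 3/8, as on the slide.
regions = [build_region([1, 3, 5]), build_region([5, 8, 3])]
matches, regions_read = scan_less_than(regions, 2)
print(matches, regions_read)
```

Only the first region is read for `B < 2`, because the second region's minimum of 3 rules it out before any I/O is issued.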

(16)

Smart Flash Cache

• Caches read and write I/Os in PCI flash
• Transparently accelerates read and write intensive workloads
  – Up to 2.66 million 8K read IOPS from SQL
  – Up to 1.96 million 8K write IOPS from SQL
• Persistent write cache speeds database recovery
• Exadata Flash Cache is much more effective than flash tiering architectures used by others
  – Caches current hot data, not yesterday’s
  – Caches data in granules 8x to 16x smaller than tiering
  – Greatly improves the effectiveness of flash
• Other Flash features can be configured if needed

(17)

Avoid the 3X Club

Some Exadata optimizations may require a little effort, but they’re worth it.

Data Warehouse workloads should improve >7X on Exadata

(18)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(19)

Avoid the 3X Club – an Example

EDW for Large Organization in Salt Lake valley

• Moved to Exadata beginning September 2012
• Configured/Tuned Exadata optimizations for October 2012

Average Response Time

(20)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(21)

Smart Scan Processing

Who are my customers in Salt Lake City?

Select name, customer#...
Where city=‘SALT LAKE CITY’

Oracle DB Grid and Exadata Storage Grid:
• Smart Scan identifies rows / columns in the 1 TB table that match the SQL (1000 rows)
• IO is executed and 20MB returned from storage to PGA
• 1000 rows returned to client

(22)

Smart Scan Comparison

Standard Operations: Storage Servers return 8K blocks to the SGA on the Database Servers

Smart Scans: Storage Servers return only rows and columns to the PGA on the Database Servers

(23)

Smart Scan Requirements

• Full table scan or index fast full scan
  – No IOTs, Clustered Tables or LOBs
• Direct path reads
  – Direct path reads happen for
    • Serial queries of “large” tables (11gR2)
      – Function of Buffer Cache Size, threshold and object size
        » _small_table_threshold
    • Parallel queries

(24)

Smart Scans – How do you know?

Execution Plan

• TABLE ACCESS STORAGE FULL

• Storage() predicate

• Only indicates Smart Scan is eligible to be

performed; does not mean it is

(25)

Smart Scans – How do you know?

• Statistic views (V$MYSTAT, V$SESSTAT)
  – cell physical IO bytes eligible for predicate offload
  – cell physical IO interconnect bytes
  – cell physical IO interconnect bytes returned by smart scan
• V$SQL views (IO_ columns)
  – IO_CELL_OFFLOAD_RETURNED_BYTES
  – IO_CELL_OFFLOAD_ELIGIBLE_BYTES
• Wait events
  – cell smart table scan
  – cell smart index scan
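One way to read the two V$SQL columns together is as a savings percentage. The Python sketch below is an illustration under assumptions: the function name is made up, the formula is one common way to summarise offload rather than an official Oracle metric, and the inputs reuse the 1 TB scanned / 20 MB returned figures from the Smart Scan Processing slide. With compressed data the returned bytes can exceed the eligible bytes, so treat the result as a rough indicator only.

```python
def offload_savings_pct(eligible_bytes, returned_bytes):
    """Percent of offload-eligible bytes that were filtered out at the storage cells.

    eligible_bytes plays the role of IO_CELL_OFFLOAD_ELIGIBLE_BYTES and
    returned_bytes the role of IO_CELL_OFFLOAD_RETURNED_BYTES.
    """
    if eligible_bytes <= 0:
        return None  # nothing was eligible, so no smart scan took place
    return 100.0 * (eligible_bytes - returned_bytes) / eligible_bytes


TB = 1024 ** 4
MB = 1024 ** 2

# Figures from the Smart Scan Processing slide: 1 TB scanned, 20 MB returned.
pct = offload_savings_pct(1 * TB, 20 * MB)
print(f"{pct:.4f}% of eligible bytes never crossed the interconnect")
```

A value near zero, or a `None`, for a large full scan is a hint to look at the causes on the "Why don't they happen?" slide.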

(26)

Smart Scans – How do you know?

An Easier Way…

SQL Monitor

(27)

Smart Scans – Why don’t they happen?

• Index scan used instead
• Buffer cache too large
  – Many table blocks in buffer cache
• Chained rows
  – Tables with more than 255 columns
• Certain functions (see v$sqlfn_metadata)
• Table “too small” (_small_table_threshold)
• Read consistency

(28)

Smart Scans – How to get them?

• Accurate, Up-to-date Statistics
  – Are ETL jobs gathering stats appropriately?
  – Use auto sample size
  – Exadata System stats
    • This is how the optimizer becomes Exadata aware
    • exec dbms_stats.gather_system_stats('EXADATA');
• Right Sized SGA
  – Most Data warehouses shouldn’t need more than 16GB
• Avoid row by row processing
• Appropriate use of Indexes

(29)

To Index or Not to Index

So if Smart Scans are so great, do we even need indexes anymore?

YES!...

You still need indexes for queries that read one or a few rows out of many

Also keep many FK indexes, especially if used for Star Transformations

(30)

To Index or Not to Index

• Many indexes will be obsolete and should be removed to help drive smart scans
• Test by:
  – Making indexes invisible and testing queries

(31)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(32)

Parallelism on Exadata

• Parallelism executes the same on or off Exadata
• PX works much better on Exadata and can be a big performance boost
  – Pushes Direct Path Reads to enable smart scans
  – Exadata architecture enables parallelism through storage cell CPUs and disks all working together
    • Load split across DB and Cell CPUs
• Allows lower DOP on Exadata to achieve optimal performance
• Easy to overwhelm a system with Parallelism

(33)

Parallelism Guidelines

• Control parallel load
  – Parallel init parameters
  – Parallel Statement Queuing
  – DBRM resource plans
    • Set parallel degree limits and max % targets
• Set parallel degree on large tables
  – ALTER TABLE [TABLE NAME] PARALLEL 12;
• Use parallelism for direct path loads in ETL

(34)

Key Parallel Init Parameters

• PARALLEL_MAX_SERVERS
  – Max # of instance parallel workers
  – Recommend leaving at default (CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10)
• PARALLEL_MIN_SERVERS
  – Min # of instance parallel workers (default 0)
  – Helps control overhead of creating and destroying workers
  – Recommend setting to high daily average of workers

See Oracle Support Note 1274318.1 for Exadata
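As a quick worked example of the default quoted above, the Python sketch below evaluates CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10. The input values (24 and 2) are hypothetical, and the formula is taken from the slide as written; the actual default can differ by database version and memory settings, so check the instance before relying on the number.

```python
def default_parallel_max_servers(cpu_count, threads_per_cpu):
    """Default as quoted on the slide: CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10."""
    return cpu_count * threads_per_cpu * 10


# Hypothetical instance settings, for illustration only.
cpu_count = 24
threads_per_cpu = 2

max_servers = default_parallel_max_servers(cpu_count, threads_per_cpu)
print(max_servers)
```

The size of that number relative to the core count is a reminder of how easily parallelism can overwhelm a system when nothing else limits it.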

(35)

Parallel Init Parameters

AUTO DOP

• Enabled by parallel_degree_policy
  – Manual (Default), Limited, Auto
• Each statement is automatically evaluated as a candidate for parallelism, whether or not statements contain parallel hints or objects have a DOP set
• Controlled by parallel_min_time_threshold
  – 10 seconds by default
  – Statements expected to run longer are candidates for automatic parallelization

(36)

Parallel Statement Queuing

• Holds parallel statements in a queue until enough parallel slaves are available
• Protects against overwhelming the server with parallel processes
• Delivers a more consistent performance profile
• Can be enabled without Auto DOP by setting _parallel_statement_queuing = TRUE
• Control when queuing starts by using PARALLEL_SERVER_TARGET
• Statements are queued first in, first out (FIFO)
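The queuing behaviour described above can be modelled with a small Python simulation. This is a simplified sketch, not Oracle's scheduler: the `StatementQueue` class, the statement names, and the server counts are all hypothetical, and it assumes no single statement needs more servers than the target.

```python
from collections import deque


class StatementQueue:
    """Toy model of parallel statement queuing with a strict FIFO queue."""

    def __init__(self, server_target):
        self.server_target = server_target  # plays the role of PARALLEL_SERVER_TARGET
        self.in_use = 0                     # parallel servers currently busy
        self.waiting = deque()              # FIFO queue of (name, servers_needed)
        self.running = {}                   # name -> servers held

    def submit(self, name, servers_needed):
        self.waiting.append((name, servers_needed))
        self._admit()

    def finish(self, name):
        self.in_use -= self.running.pop(name)
        self._admit()

    def _admit(self):
        # Only the head of the queue may start, so a statement that does not
        # fit holds back everything queued behind it.
        while self.waiting:
            name, needed = self.waiting[0]
            if self.in_use + needed > self.server_target:
                break
            self.waiting.popleft()
            self.running[name] = needed
            self.in_use += needed


q = StatementQueue(server_target=16)
q.submit("load_fact", 12)
q.submit("report_a", 8)   # does not fit next to load_fact, so it queues
q.submit("report_b", 4)   # would fit, but waits behind report_a (FIFO)
print(sorted(q.running), [name for name, _ in q.waiting])

q.finish("load_fact")     # frees 12 servers; both reports can now start
print(sorted(q.running), [name for name, _ in q.waiting])
```

The second `print` shows both reports running together once the load finishes, which is the more consistent performance profile the slide refers to: statements wait briefly instead of all starting at once with too few servers.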

(37)

(38)

Parallel Statement Monitoring

• OEM / Grid Control
  – SQL Monitoring specifically
• GV$PX_PROCESS
  – One record per Parallel Worker
• GV$SQL_MONITOR
  – Also shows queued parallel statements

See Oracle Support Note 135043.1 for more monitoring queries

(39)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(40)

Hybrid Columnar Compression

• Data is organized and compressed by column in compression units (CU)
• Speed Optimized Query Compression for Data Warehousing
  – 5X to 10X compression typical
  – Runs faster because of Exadata offload!
• Space Optimized Archival Compression for infrequently accessed data
  – 10X to 50X compression typical

Faster and simpler backup, DR, caching, reorg, clone

(41)

Hybrid Columnar Compression

Uncompressed

VENDOR_ID  VEND_NAME  STATE  VNDR_RATING  VENDOR_TYPE
=========  =========  =====  ===========  ===========
100        ACME ONE   MI     100          DIRECT
101        ACME ONE   CA     90           DIRECT
102        NORTON     IA     95           INDIRECT
103        WINGDINGS  MI     96           INDIRECT
104        WINGDINGS  GA     96           INDIRECT

Rows are stored one after another in the block, followed by free space:

100ACME ONEMI100DIRECT| 101ACME ONECA90DIRECT| 102NORTONIA95INDIRECT| 103WINGDINGSMI96INDIRECT| 104WINGDINGSGA96INDIRECT

Hybrid Columnar Compression

Logical Compression Unit: block headers and a CU header, followed by the data stored column by column: VENDOR_ID, VEND_NAME, VNDR_RATING, STATE, VENDOR_TYPE, COL6, COL7, COL8, COL9, COL10

(42)

Hybrid Columnar Compression

Performance Benefits

• If queries select a single column or a subset of columns, Oracle will only need to read from blocks on which the columns exist
  – This is different from other types of compression and uncompressed tables
• Not only is space saved, but also IO
• Saving IO means better performance!
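The IO saving can be illustrated with the vendor rows from the earlier slide. The Python sketch below is a conceptual model only: it pivots rows into per-column lists and counts characters as a stand-in for bytes read. It does not represent Oracle's compression unit format and applies no compression at all; the function names are made up for the example.

```python
COLUMNS = ["VENDOR_ID", "VEND_NAME", "STATE", "VNDR_RATING", "VENDOR_TYPE"]

ROWS = [
    (100, "ACME ONE", "MI", 100, "DIRECT"),
    (101, "ACME ONE", "CA", 90, "DIRECT"),
    (102, "NORTON", "IA", 95, "INDIRECT"),
    (103, "WINGDINGS", "MI", 96, "INDIRECT"),
    (104, "WINGDINGS", "GA", 96, "INDIRECT"),
]


def to_columnar(rows, columns):
    """Pivot row tuples into one list of values per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def chars_read(store, wanted):
    """Characters that must be read to fetch the wanted columns."""
    return sum(len(str(value)) for name in wanted for value in store[name])


store = to_columnar(ROWS, COLUMNS)

# Row format: every column of every row is read, whatever the query selects.
row_format_cost = chars_read(store, COLUMNS)

# Columnar format: a query on STATE alone touches only the STATE values.
columnar_cost = chars_read(store, ["STATE"])

print(row_format_cost, columnar_cost)
```

Repeated values such as ACME ONE or INDIRECT also end up next to each other in the columnar layout, which is what makes column-wise compression effective.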

(43)

HCC – Why Not?

• HCC requires direct path loads
  – Conventional inserts use OLTP compression
• Deletes against HCC tables lock entire CU
• When updating HCC tables:
  – The updated row is migrated (i.e., deleted + re-inserted into a new block, leaving a pointer behind)
  – New row is OLTP-compressed
  – Locks impact entire CU, not just row!

(44)

HCC Use Cases

• Use OLTP compression for DW tables by default, and then use HCC compression when
  – Data is direct path loaded (CTAS, INSERT /*+ APPEND */)
  – Data is not updated
    • Or rarely updated and truncated and reloaded periodically
• Partition tables with different compression levels
  – Updated Data = OLTP compression
  – Heavily Queried Data = Query / Archive Low compression
  – Cold / Archive Data = Archive High compression
• Use compression advisor to preview compression ratio

(45)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(46)

IORM

• IO Resource Management (IORM) governs and meters IO from different workloads in the Exadata Storage Servers
• A common challenge with shared storage infrastructure is that of competing IO workloads
  – Batch vs. OLTP
  – Warehouse vs. OLTP
  – Production vs. Test and Development
• Competing priorities can be mitigated by over-provisioning storage, but this becomes expensive
• Exadata addresses this challenge with IORM

(47)

IORM and DBRM

• Oracle DBRM allows managing CPU and other internal DB resources, e.g. parallelism, among competing workloads in a single database
  – DBRM is not Exadata specific
• With Exadata IORM integration, IO resources are also controlled by DBRM
• A DBRM resource plan is also called an “intra-database resource plan”

(48)

IORM Plans

Approaches for managing resource allocations

• Intra-database resource plans manage multiple workloads in a single database
  – If only one database on the Exadata machine, only an intra-database resource plan is needed
• Inter-database resource plans manage resources among multiple databases on Exadata
  – Specifies allocations to databases, not consumer groups
  – Category plans allow resource control across databases by the type of workload
  – An IORM plan is the combination of an inter-database plan and a category plan

(49)

IORM and DBRM

DBRM Example

• Database DBM consumer groups: OM OLTP, Other OLTP, Reporting
• Database XBM consumer groups: Online query, Batch query

(50)

IORM and DBRM

Category Plan Example

• Database DBM consumer groups: OM OLTP, Other OLTP, Reporting
• Database XBM consumer groups: Online query, Batch query
• Categories: Interactive, Batch

IORM Example

All User IO = 100%
• Category Plan: 70% Interactive, 30% Batch
• Interdatabase Plan: 60% DBM, 40% XBM
• Intradatabase Plan: 30% and 70% for the two XBM groups; 20%, 30% and 50% for the three DBM groups

IORM Allocation
DBM: OM OLTP        26.25%
DBM: Other OLTP     15.75%
DBM: Reporting      18.00%
XBM: Online query   28.00%
XBM: Batch query    12.00%
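The final percentages in this example can be reproduced by multiplying the three plan levels together. The Python sketch below is a worked check, not IORM's actual algorithm. The mapping of consumer groups to categories and to intradatabase percentages is not legible in the slide text, so the mapping used here is an assumption, chosen because it reproduces the published allocations exactly.

```python
from fractions import Fraction

CATEGORY_PLAN = {"Interactive": 70, "Batch": 30}
INTERDATABASE_PLAN = {"DBM": 60, "XBM": 40}

# (database, consumer group) -> (category, intradatabase share).
# Assumed mapping: it is inferred from the final allocations, not read off the slide.
GROUPS = {
    ("DBM", "OM OLTP"): ("Interactive", 50),
    ("DBM", "Other OLTP"): ("Interactive", 30),
    ("DBM", "Reporting"): ("Batch", 20),
    ("XBM", "Online query"): ("Interactive", 70),
    ("XBM", "Batch query"): ("Batch", 30),
}


def iorm_allocations():
    """Percent of all user IO per consumer group: category x database x group share."""
    result = {}
    for (db, group), (category, share) in GROUPS.items():
        # A group's share is taken relative to its peers: the groups of the
        # same database that fall into the same category.
        peers = sum(
            s for (d, _), (c, s) in GROUPS.items() if d == db and c == category
        )
        pct = (
            Fraction(CATEGORY_PLAN[category], 100)
            * Fraction(INTERDATABASE_PLAN[db], 100)
            * Fraction(share, peers)
            * 100
        )
        result[(db, group)] = float(pct)
    return result


alloc = iorm_allocations()
for (db, group), pct in alloc.items():
    print(f"{db}: {group:<14}{pct:6.2f}%")
```

For example, OM OLTP receives 70% (Interactive) of 60% (DBM) of 50/80 (its share among DBM's interactive groups), which is 26.25%.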

(52)

IORM Rules

• IORM is only “engaged” when needed
• Leftover disk allocation is made available to other workloads in relation to the configured resource plans
  – max limits can be set
• Background IO is prioritized relative to user IO
  – Redo and control file writes always take precedence
  – DBWR writes are scheduled at the same priority as user IO
• If no intra-database plan is set, all non-background IO requests are grouped into the default

(53)

IORM Plan Syntax

(54)

IORM Monitoring

• IORM Metrics using CELLCLI / DCLI
• Metric IORM script
  – See Oracle Support Note: “Tool for Gathering I/O Resource Manager Metrics: metric_iorm.pl [ID 1337265.1]”
• OEM (Grid Control) Exadata plugin

Metric Name                          Meaning
DB_IO_RQ_SM, DB_IO_RQ_LG             Total number of IO requests issued by the database since any resource plan was set
DB_IO_RQ_SM_SEC, DB_IO_RQ_LG_SEC     IO requests per second issued by the database in the last minute
DB_IO_WT_SM, DB_IO_WT_LG             Total number of seconds that IO requests issued by the database waited to be scheduled
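The cumulative request and wait metrics are easier to interpret as an average wait per request. The Python sketch below shows that arithmetic only: the sample numbers are invented, the function name is hypothetical, and the time unit follows the slide's description of the wait metrics, which should be verified against the metric definitions on the cell.

```python
def avg_wait_per_request(total_wait, total_requests):
    """Average time an IO request waited to be scheduled by IORM."""
    if total_requests == 0:
        return 0.0  # no requests issued, so nothing waited
    return total_wait / total_requests


# Invented sample values for one database; not real measurements.
sample = {
    "DB_IO_RQ_SM": 400_000,   # small IO requests
    "DB_IO_WT_SM": 200.0,     # total wait for small IO requests
    "DB_IO_RQ_LG": 50_000,    # large IO requests
    "DB_IO_WT_LG": 1_500.0,   # total wait for large IO requests
}

small_wait = avg_wait_per_request(sample["DB_IO_WT_SM"], sample["DB_IO_RQ_SM"])
large_wait = avg_wait_per_request(sample["DB_IO_WT_LG"], sample["DB_IO_RQ_LG"])
print(small_wait, large_wait)
```

Comparing this figure across databases gives a first indication of which workload IORM is throttling.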

(55)

IORM

Unless you only have one database with a single type of workload on Exadata, you should use IORM

In other words…

(56)

IORM Beneﬁts

EDW for Large Organization in Salt Lake valley

(57)

Avoid the 3X Club

• Tune for Smart Scans

• Wisely use Parallelism

• Compress with HCC where appropriate

• Invoke Resource Management (IORM)

(58)

Follow DW Best Practices

Oracle data warehousing on Exadata is still data warehousing on Oracle

(With a few incredible innovations :)

So…

(59)

Follow DW Best Practices

Key Best Practices

• Dimensional Model (Star Schema)
• Well-written SQL
• Table Partitioning (particularly fact tables)
  – Partition by load frequency, subpartition by join hash
  – Partition Exchange loading
• Parallel, Direct-Path (possibly NOLOGGING) Data Loading
  – Including Constraint and Index management
• Query Rewrite

(60)

Getting the Most Out of Your Exadata DW

• DW Best Practices
• Parallelism
• Hybrid Columnar Compression
• Smart Scans (Storage Offloading)

(61)

Extreme Data Warehouse Performance with Oracle Exadata