Extreme Data Warehouse Performance
with Oracle Exadata
Kasey Parker
Managed Services Cloud Services Consul3ng Services Licensing• Managed Services • Cloud Services • Consul3ng Services • Licensing
Who is Centroid?
§
Centroid is a leading provider of Oracle Technology, Applica8ons and
Infrastructure/Hos8ng solu8ons
§
Established in 1997
§
Office loca8ons: Troy, MI (HQ); San Francisco, CA; Los Angeles, CA; Dallas, TX
§
200+ Consultants
§
Oracle Pla8num Partner
•
Selected to Oracle’s Top 25 Strategic Partner Program
•
Top 5 Oracle Partner for Hardware/Storage
§
100% Oracle “Red Stack” Focused
§
“Clients for life” approach to customer rela8onships
§
Oracle Exadata Center of Excellence established in 2011
•
Centroid Authored -‐ Oracle Exadata Recipes (Published Feb-‐2013)
Agenda
§
Exadata Overview
§
Why Exadata?
§
Exadata’s Secret Sauce
§
GeAng the Most out of Exadata DW
§
Avoiding the 3X Club
Exadata Architecture
Database hardware and soIware plaKorm “in a box”
Scale-‐Out Database Servers
•
8x 2-‐socket, or 2x 8-‐socket Xeon database servers
•
Oracle Database, ASM, RAC; Linux or Solaris
•
Standard Ethernet to data center
Scale-‐Out Intelligent Storage Servers
•
2-‐socket storage servers, Exadata Storage SoIware
•
Up to 672 terabytes disk per rack
•
56 PCI Flash memory cards per rack
InfiniBand Network
Exadata Configura3on Op3ons
Start small and grow as needed – upgraded onsite
Half Rack
Full Rack
Quarter Rack
Exadata Hardware Summary
X4-2 Full X4-2 Half X4-2 Quarter X4-2 Eighth
Database Servers 8 4 2 2
Database Grid Cores 192 96 48 24
Database Grid Memory (GB) 2048 (max 4096) 1024 (max 2048) 512 (max 1024) 512 (max 1024)
InfiniBand switches 2 2 2 2
Ethernet switch 1 1 1 1
Exadata Storage Servers 14 7 3 3
Storage Grid CPU Cores 168 84 36 18
Raw Flash Capacity 44.8 TB 22.4 TB 9.6 TB 4.8 TB
Raw Storage Capacity
High Perf 200 TB 100 TB 43.2 TB 21.6 TB
High Cap 672 TB 336 TB 144 TB 72 TB
Usable mirrored capacity
High Perf 90 TB 45 TB 19 TB 9 TB
High Cap 300 TB 150 TB 63 TB 30 TB
Usable Triple mirrored capacity
Exadata Hardware
Exadata X4-‐2 SQL IO Performance
1 -‐ Bandwidth is peak physical scan bandwidth achieved running SQL, assuming no compression. Effec3ve data bandwidth will be much higher when compression is factored in.
2 -‐ IOPS – Based on read IO requests of size 8K running SQL, typically with sub-‐millisecond latencies. Note that the IO size greatly effects flash IOPS. Others quote IOPS based on 2K, 4K or smaller IOs that are not relevant for databases and measure IOs using low level tools instead of SQL.
3-‐ Actual Performance varies by applica3on.
4 –Load rates are typically limited by database server CPU, not IO. Rates vary based on load method, indexes, data types, compression, and par33oning
X4-2
Full Rack
Half Rack
X4-2
Quarter
X4-2
Eighth
X4-2
Flash Cache
SQL Bandwidth
1,3High Cap Disk
100 GB/s
50 GB/s
21.5 GB/s
10.7 GB/s
High Perf Disk
100 GB/s
50 GB/s
21.5 GB/s
10.7 GB/s
Flash SQL IOPS
2,38K Reads
2,660,000
1,330,000
570,000
285,000
8K Writes
1,960,000
980,000
420,000
210,000
Disk SQL
Bandwidth
1,3High Cap Disk
20 GB/s
10 GB/s
4.5 G/s
2.25 GB/s
High Perf Disk
24 GB/s
12 GB/s
5.2 GB/s
2.6 GB/s
Disk SQL IOPS
High Cap Disk
32,000
16,000
7,000
3,500
High Perf Disk
50,000
25,000
10,800
5,400
Why Exadata?
Exadata is designed to
eliminate the most common
bomleneck for large
databases…
Timely transfer of large data
sets from storage subsystem to
database server
Why Exadata?
Solving the IO BoTleneck
Solu3on 1: Enlarge the pipe
Why Exadata?
Can’t we do that with other high
performance storage soluVons?
YES…
There is nothing Magical about
Exadata hardware, and it’s s3ll the
same Oracle Database
Why Exadata?
Solving the IO BoTleneck
Solu3on 2: Reduce the IO opera3ons
•
Done using Exadata’s Secret Sauce: Smart Storage, Smart Flash
Cache and Hybrid Columnar Compression
Exadata Innova3ons
•
Some are automa3c, with limited
configura3on ability
–
Storage Indexes
–
Smart Flash Cache
•
Some may require some effort
–
Smart Scans
–
Hybrid Columnar Compression (HCC)
Storage Indexes
•
Exadata Storage Indexes maintain summary
information about table data in memory
•
Store MIN and MAX values of columns
•
Typically one index entry for every MB of disk
•
Eliminates disk I/Os if MIN and MAX can never
match “where” clause of a query
•
Completely automatic and transparent
A B C D
1
3
5
5
8
3
Min B = 1
Max B =5
Table
Index
Min B = 3
Max B =8
Select * from Table where B<2 - Only first set of rows can match
Smart Flash Cache
•
Caches Read and Write I/Os in PCI flash
•
Transparently accelerates read and write intensive
workloads
–
Up to 2.66 million 8K read IOPS from SQL
–
Up to 1.96 million 8K write IOPS from SQL
•
Persistent write cache speeds database recovery
•
Exadata Flash Cache is much more effec3ve than
flash 3ering architectures used by others
–
Caches current hot data, not yesterday’s
–
Caches data in granules 8x to 16x smaller than 3ering
•
Greatly improves the effec3veness of flash
I/Os
2.66 Million 8K Read
1.96 Million 8K Write
IOPS from SQL
Other Flash Features can be configured if needed
Avoid the 3X Club
Some Exadata op3miza3ons may require
a limle effort – but they’re worth it.
Data Warehouse workloads should
improve >7X on Exadata
Avoid the 3X Club
•
Tune for Smart Scans
•
Wisely use Parallelism
•
Compress with HCC where appropriate
•
Invoke Resource Management (IORM)
Avoid the 3X Club – an Example
EDW for Large Organiza3on in Salt Lake valley
•
Moved to Exadata beginning September 2012
•
Configured/Tuned Exadata op3miza3ons for October 2012
Average Response Time
Avoid the 3X Club
•
Tune for Smart Scans
•
Wisely use Parallelism
•
Compress with HCC where appropriate
•
Invoke Resource Management (IORM)
Smart Scan Processing
Select name, customer#...
Where city=‘SALT LAKE CITY’
•
Smart Scan idenVfies rows / columns
in the 1 TB tables that match the SQL
(1000 rows)
•
IO is executed and 20MB
returned from storage to
PGA
Who are my
customers in
Salt Lake
City?
Oracle DB
Grid
Exadata
Storage
Grid
•
1000 rows returned to
client
Smart Scan Comparison
8K
Blocks
SGA
Rows and
Columns
PGA
Standard
Operations
Smart Scans
Storage Servers
Database Servers
Smart Scan Requirements
•
Full table scan or index fast full scan
–
No IOTs, Clustered Tables or LOBs
•
Direct path reads
–
Direct path reads happen for
•
Serial queries of “large” tables (11gR2)
–
Func3on of Buffer Cache Size, threshold and object size
»
_small_table_threshold
•
Parallel queries
Smart Scans – How do you know?
Execu3on Plan
•
TABLE ACCESS STORAGE FULL
•
Storage() predicate
•
Only indicates Smart Scan is eligible to be
performed; does not mean it is
Smart Scans – How do you know?
•
Sta3s3c views (V$MYSTAT, V$SESSTAT)
–
cell physical IO bytes eligible for predicate offloading
–
cell physical IO interconnect bytes
–
cell physical IO interconnect bytes return by smart scan
•
V$SQL views (IO_ columns)
–
IO_CELL_OFFLOAD_RETURNED_BYTES
–
IO_CELL_OFFLOAD_ELIGIBLE_BYTES
•
Wait events
–
cell smart table scan
–
cell smart index scan
Smart Scans – How do you know?
A Easier Way…
SQL Monitor
Smart Scans – Why don’t they happen?
•
Index scan used instead
•
Buffer cache too large
–
Many table blocks in buffer cache
•
Chained rows
–
Tables with more than 255 columns
•
Certain func3ons (see
v$sqlfn_metadata
)
•
Table "too small” (
_small_table_threshold
)
!
•
Read consistency
Smart Scans – How to get them?
•
Accurate, Up-‐to-‐date Sta3s3cs
–
Are ETL jobs gathering stats appropriately?
–
Use auto sample size
–
Exadata System stats
•
This is how the op3mizer becomes Exadata aware
• exec dbms_stats.gather_system_stats('EXADATA');!
•
Right Sized SGA
–
Most Data warehouses shouldn’t need more than 16GB
•
Avoid row by row processing
•
Appropriate use of Indexes
To Index or Not to Index
So if Smart Scans are so great do we even need
indexes anymore?
YES!...
You s3ll need indexes for queries with
single/few out of many row reads
Also keep many FK indexes – especially
if used for Star Transforma3ons
To Index or Not to Index
•
Many indexes will be obsolete and should be
removed to help drive smart scans
•
Test by:
–
Making indexes invisible and tes3ng queries
Avoid the 3X Club
•
Tune for Smart Scans
•
Wisely use Parallelism
•
Compress with HCC where appropriate
•
Invoke Resource Management (IORM)
Parallelism on Exadata
•
Parallelism executes the same on or off Exadata
•
PX works much bemer on Exadata and can be a big
performance boost
–
Pushes Direct Path Reads to enable smart scans
–
Exadata architecture enables parallelism through
storage cell CPUs and disks all working together
•
Load split across DB and Cell CPUs
•
Allows lower DOP on Exadata to achieve op3mal
performance
•
Easy to overwhelm a system with Parallelism
Parallelism Guidelines
•
Control parallel load
–
Parallel init parameters
–
Parallel Statement Queuing
–
DBRM resource plans
•
Set parallel degree limits and max % targets
•
Set parallel degree on large tables
– ALTER TABLE [TABLE NAME] PARALLEL 12;
•
Use parallelism for direct path loads in ETL
Key Parallel Init Parameters
•
PARALLEL_MAX_SERVERS
•
Max # of instance parallel workers
•
Recommend leaving at default (CPU_COUNT *
PARALLEL_THREADS_PER_CPU*10)
•
PARALLEL_MIN_SERVERS
•
Min # of instance parallel workers (default 0)
•
Helps control overhead of crea3ng and destroying
workers
•
Recommend seAng to high daily average of
workers
See Oracle Support Note
1274318.1 for Exadata
Parallel Init Parameters
AUTO DOP
•
Enabled by
parallel_degree_policy !
•
Manual (Default), Limited, Auto
•
Each statement automa3cally evaluated as a
candidate for parallelism; whether or not statements
contain parallel hints or objects have a DOP set
•
Controlled by
parallel_min_time_threshold
•
10 seconds by default
•
Statements expected to run longer are candidates for
automa3c paralleliza3on
Parallel Statement Queuing
•
Limits concurrent parallel processes un3l enough
slaves are available
•
Protects against overwhelming the server with
parallel processes
•
Delivers a more consistent performance profile
•
Can be enabled without Auto DOP by seAng
_parallel_statement_queuing = TRUE!
•
Control when queuing starts by using
PARALLEL_SERVER_TARGET!
•
Statements queued in FIFO method
!
Parallel Statement Monitoring
•
OEM / Grid Control
!
–
SQL Monitoring specifically
•
GV$PX PROCESS
–
One record per Parallel Worker
•
GV$SQL_MONITOR
–
Also shows queued parallel statements
See Oracle Support Note
135043.1 for more
monitoring queries
Avoid the 3X Club
•
Tune for Smart Scans
•
Wisely use Parallelism
•
Compress with HCC where appropriate
•
Invoke Resource Management (IORM)
Hybrid Columnar Compression
• Data is organized and compressed by
column in compression units (CU)
• Speed Optimized Query Compression for
Data Warehousing
•
5X to 10X compression typical
•
Runs faster because of Exadata offload!
• Space Optimized Archival Compression
for infrequently accessed data
•
10X to 50X compression typical
Qu
er
y
Faster and Simpler
Backup, DR, Caching,
Reorg, Clone
Hybrid Columnar Compression
VENDOR_ID VEND_NAME STATE VNDR_RATING VENDOR_TYPE ========== =========== ===== =========== ========== 100 ACME ONE MI 100 DIRECT 101 ACME ONE CA 90 DIRECT 102 NORTON IA 95 INDIRECT 103 WINGDINGS MI 96 INDIRECT 104 WINGDINGS GA 96 INDIRECT 100ACME ONEMI100DIRECT| 101ACME ONECA()DIRECT| 102NORTONIA95INDIRECT| 103WINGDINGSMS96INDIREC T| 104WINGDINGSGA96INDIREC T
Free space
Uncompressed
Hybrid Columnar Compression
Logical Compression Unit
<-‐ Header -‐> CU Header-‐>