DB2 LUW : The Latest from the Lab
or
“DB2 Cancun” : A Technical Overview
Matt Huras
IBM
Session D10
Wed Nov 12 09:45-10:45 Platform: Linux Unix Windows
Abstract
The DB2 development team has been very busy since the
ground breaking DB2 10.5 release last year. Learn about what
they've been up to since then ! While a specific feature list is
not available as of the time of writing this abstract, the topic
areas covered may include:
- the latest in-memory query performance advances with BLU acceleration
- new ways of leveraging BLU acceleration in operational & OLTP systems
- data compression advances
- pureScale advances, including new options for running pureScale on
commodity networks
- exploiting the capabilities of the latest processors
The author will focus on the value the features bring to
customers, and the key nuggets of internal design information
that will allow DBAs to understand how to get the most out of them.
Safe Harbour Statement
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment,
promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
DB2 LUW : Recent History
Kepler (DB2 10.5)
- BLU Acceleration : Quantum Leap in Performance, Quantum Drop in TCO & Storage
- pureScale Advances : Continuous Availability, Workload Consolidation
- 99+% SQL Compatibility
Galileo (DB2 10)
- 3x Query Performance Boost
- 50% Compression Boost
- Temporal Query
- noSQL Graph Store
- HADR Multiple Standby
pureScale (DB2 9.8)
- Virtually Unlimited Capacity
- Transparent Scalability
- Leading Availability
TCO & Performance : 3x Query Performance, New Index Exploitation, Adaptive Compression, Multi-temp Storage, Real-time Warehousing
Ease of Development : Temporal Query, 98% SQL Compatibility, Graph Store, RCAC
Reliability / Availability : pureScale Integration & Enhancements, WLM Enhancements, Reorg Avoidance, HADR Multiple Standby
Virtually Unlimited Capacity
- Any OLTP/ERP workload
- Start small; grow with your business
Application Transparent Scaling
- Avoid the risk & cost of tuning your applications to the database topology
Availability
- Maintain service across planned & unplanned events
TCO & Performance : BLU Acceleration, Even more pureScale Performance, Workload Consolidation with pureScale
Ease of Development : 99+% SQL Compatibility, Index on Expression
Reliability / Availability : Rolling Updates, HADR with pureScale, pureScale Active/Active DR Enhancements, Multiple HCAs at each Member (*), Online add/drop Member, Other availability enhancements
What’s Next
24x7 Maintenance and Growth with pureScale
Storage server
• 1 IBM Storwize V7000
• 8 SSD drives (2TB usable capacity) : 4 for data, 4 for logs
Database servers
• SUSE Linux Enterprise Server 11 SP1
• 6 IBM x3950 X5s (Intel Xeon X7560 @ 2.27GHz, 4s/32c/64t)
• Mellanox ConnectX-2 IB card
• 128GB system memory
[Chart: total transactions per second over a 4-hour run while updating the secondary CF, updating the primary CF, updating members 1, 2 & 3, adding a 4th member, then starting the new member and adding more app clients]
BLU Performance Examples from 10.5 GA
Customer                           Speedup over DB2 10.1
Large Financial Services Company   46.8x
Global ISV Mart Workload           37.4x
Analytics Reporting Vendor         13.0x
Global Retailer                    6.1x
Large European Bank                5.6x
8x-25x improvement is common
“It was amazing to see the faster query times compared to the performance
results with our row-organized tables.
The performance of four of our
queries improved by over 100-fold! The best outcome was a query that
finished 137x faster by using BLU Acceleration.”
Introducing DB2 “Cancun” (a.k.a. 10.5.0.4)
• Get real-time answers to all questions with instant insight into historical and “as it happens” data
• Improve performance while reducing complexity and overhead
• Get fast time-to-value using skills you already have for Oracle database
• Change the economics of continuous availability with broad infrastructure choices at every price point
• Superior performance at lower cost for SAP and other environments
Get the answers you need,… in the moment
DB2 Cancun Feature Highlights
Get the answers you need in the moment
• Shadow Tables
  Leverage BLU to get real-time insight into operational data
Shadow Tables : Motivation
[Diagram: OLTP workload feeding OLAP reporting via replication and/or ETL, with N day(s) of latency]
• Can I get better performance for my OLAP and reporting jobs ?
• Can I reduce the latency so that my reports are more current ?
• Can I reduce system and human costs ? Can I do this all in one system ?
[Diagram: OLTP workload and OLAP reporting now in one system]
• Hmmm,… well now I have my reports in the same system, but,…
• It took quite a while to design those extra indexes, and I’m still not sure it’s optimal, and,
• Those extra indexes are slowing down my OLTP workload !
• Is it even possible to optimize for OLTP and OLAP in the same physical structure ? Is there a better option ?
• I’ve heard about BLU Acceleration and the great performance, compression and ease-of-use it offers. Can I use that here ?
Introducing the Shadow Tables Concept
• DB2’s SQL optimizer automatically routes large queries to BLU shadow tables
  • Based on cost
  • No need to change queries to reference shadow tables
• Shadow tables kept up-to-date by log-based replication
• Benefits
  • Dramatic improvement in performance for your OLAP and/or reporting jobs
  • Possible increase in OLTP performance after removing indexes
  • Reduced latency ; near real-time reporting
  • Reduced costs ; single-system simplicity
[Diagram: OLTP workload and OLAP reporting in one system, with the SQL optimizer routing queries and log-based replication maintaining the shadow tables]
Shadow Table Value : Have Your Cake and Eat It Too !
• Order of magnitude (> 10x) increase in reporting performance !
• Typically no impact or an increase in OLTP throughput
  • In one case we saw a 6% increase in OLTP throughput, after removing 4 indexes whose sole purpose was to increase the performance of the OLAP workload when running against row-organized tables
  • In another case we saw a 2.1x increase in OLTP throughput after removing 20 analytical indexes
(Test configuration: TPC-E workload, IBM POWER7, 4 3.55GHz cores, 16 hardware threads, 117GB RAM, 4 V7000s. Lab tests; your mileage may vary. Based on internal IBM testing of sample transactional and analytic workloads by replacing 4 analytical indexes in the transactional environment with BLU shadow tables. Performance improvement figures are cumulative of all queries in the workload. Individual results will vary depending on individual workloads, configurations and conditions.)
Shadow Tables Architecture
[Diagram: OLTP workload and OLAP reporting on one DB2 server; the CDC capture and apply engine reads the DB2 log and applies changes to the shadow tables]
• IBM InfoSphere Data Replication (a.k.a. CDC) is included in DB2 AWSE and AESE (for shadow table usage)
• CDC records the time of the last replication in SYSTOOLS.REPL_MQT_LATENCY; the optimizer checks it against CURRENT REFRESH AGE
• SET CURRENT REFRESH AGE 500; allows the optimizer to consider routing to shadow tables that were refreshed within the last 5 minutes. The format is a duration, yyyymmddhhmmss.nnnnnn. Set to ANY if any latency is OK.
Up and Running with Shadow Tables in 3 Steps
1) Creating A Shadow Table
• Shadow table is defined as a regular Materialized Query Table (MQT)
  • Projection-only MQT definition
  • No need to figure out which aggregations or joins to include in the definition
• Include all columns or a subset
  • Include only needed keys and measures of interest
  • Project a primary or unique key to ensure optimal replication
-- Row-organized base table:
CREATE TABLE trade
  (id BIGINT NOT NULL PRIMARY KEY,
   dts TIMESTAMP NOT NULL,
   symb VARCHAR(15) NOT NULL,
   qty INTEGER NOT NULL,
   bid_price DECIMAL(8,2) NOT NULL,
   trade_price DECIMAL(8,2),
   charge DECIMAL(10,2))
  ORGANIZE BY ROW;

-- Shadow table:
CREATE TABLE trade_shadow AS (SELECT * FROM trade)
  DATA INITIALLY DEFERRED REFRESH DEFERRED
  ENABLE QUERY OPTIMIZATION
  MAINTAINED BY REPLICATION ORGANIZE BY COLUMN;
SET INTEGRITY FOR trade_shadow ALL IMMEDIATE UNCHECKED;
Up and Running with Shadow Tables in 3 Steps
2) Enable Automatic Routing to Shadow Tables
• Enable intra-query parallelism for reporting / OLAP jobs
  • If not already done
  • This is important for optimal BLU performance (we assume you have not set DB2_WORKLOAD=ANALYTICS in your OLTP database)
• Instruct the DB2 optimizer of the acceptable latency limit
• Create the SYSTOOLS.REPL_MQT_LATENCY table
  • Updated by CDC with the time of the last replication
  • Checked by DB2 to see if routing can be done within the specified latency limit
CALL SYSPROC.ADMIN_SET_INTRA_PARALLEL('YES');
SET CURRENT DEGREE 'ANY';
-- Enable SMP parallelism for this connection and instruct DB2 to
-- automatically decide how many CPU threads to use
SET CURRENT MAINTAINED TYPES FOR OPTIMIZATION REPLICATION;
SET CURRENT REFRESH AGE 500;
-- Allow the optimizer to consider routing to shadow tables that were
-- refreshed within the last 5 minutes. Set to ANY if any latency is OK.
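The SYSTOOLS.REPL_MQT_LATENCY table from the step above can be created with the SYSINSTALLOBJECTS procedure; a sketch, assuming the 'REPL_MQT' tool name as documented for DB2 10.5 shadow tables (verify the exact arguments in your documentation):

```sql
-- Create SYSTOOLS.REPL_MQT_LATENCY ('C' = create);
-- NULLs take the default table space and SYSTOOLS schema
CALL SYSPROC.SYSINSTALLOBJECTS('REPL_MQT', 'C',
     CAST(NULL AS VARCHAR(128)),
     CAST(NULL AS VARCHAR(128)));
```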
TIP: Avoid the need to change your reporting or OLAP application by
using a CONNECT procedure to set these up.
17
CREATE OR REPLACE PROCEDURE ADMIN_SCHEMA.SHADOW_SETUP()
BEGIN
DECLARE APPLNAME VARCHAR(128);
SET APPLNAME = (SELECT APPLICATION_NAME FROM TABLE
(SYSPROC.MON_GET_CONNECTION(MON_GET_APPLICATION_HANDLE(),-1)));
IF (
APPLNAME LIKE 'report%' OR
APPLNAME = 'end_of_day_summary'
) THEN
CALL SYSPROC.ADMIN_SET_INTRA_PARALLEL('YES');
SET CURRENT DEGREE 'ANY';
SET CURRENT MAINTAINED TYPES REPLICATION;
SET CURRENT REFRESH AGE 500;
END IF;
END@
-- Replace the IF conditions above with your own; you can use any other
-- connection attributes such as user ID, etc.

GRANT EXECUTE ON PROCEDURE ADMIN_SCHEMA.SHADOW_SETUP TO PUBLIC@

-- Every connection will now execute the procedure immediately after the
-- connection is established:
UPDATE DB CFG USING CONNECT_PROC "ADMIN_SCHEMA.SHADOW_SETUP"@
Up and Running with Shadow Tables in 3 Steps
3) Use CDC to Create the Subscription and Table Mapping, then Initiate Replication
• “Refresh” indicates CDC will do a refresh of the shadow table when replication is initiated
• Be sure to include the primary key as a target column
• In this typical case, we have a single subscription defined for all shadow tables in the database
• TIP: mark the subscription as persistent
• TIP: populate the source table first !
• TIP: if you prefer scripting, use the CHCCLP command line
Shadow Tables : DB2 Configuration Hints
• Required database configuration settings
  • Unicode database with IDENTITY or IDENTITY_16BIT collation
  • SORTHEAP and SHEAPTHRES_SHR : use specific size settings appropriate for your BLU OLAP/reporting workload
  • Set UTIL_HEAP_SZ >= 1000000 and AUTOMATIC
    • Used by BLU LOAD for best compression rate (CDC fast refresh uses DB2 LOAD)
  • Ensure log retain or archiving is enabled (required for CDC replication)
• Recommendations
  • Set DB2_EXTENDED_OPTIMIZATION = "OPT_SORTHEAP_EXCEPT_COL <existing_sortheap_value>"
    • Tells the optimizer what sort heap value to assume when determining access plans with row-organized tables
  • AUTO_RUNSTATS and AUTO_REORG = ON
    • Automatically keep shadow table stats up to date, and reclaim unused extents
  • Consider a dedicated table space and buffer pool for shadow tables
    • To account for the different access pattern
    • Increases buffer pool hit ratio for both OLTP and OLAP workloads
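The settings above might be applied with DB2 CLP commands along these lines (a sketch: the database name mydb, the sort heap sizes, and the archive log path are placeholders to adapt to your system):

```sql
-- Size sort memory for the BLU OLAP/reporting workload (values are examples)
UPDATE DB CFG FOR mydb USING SORTHEAP 200000 SHEAPTHRES_SHR 800000;
-- Utility heap for best LOAD compression
UPDATE DB CFG FOR mydb USING UTIL_HEAP_SZ 1000000 AUTOMATIC;
-- Enable log archiving (required for CDC replication); path is a placeholder
UPDATE DB CFG FOR mydb USING LOGARCHMETH1 DISK:/db2/archlogs;
-- Keep shadow table statistics current and reclaim unused extents
UPDATE DB CFG FOR mydb USING AUTO_MAINT ON AUTO_TBL_MAINT ON AUTO_RUNSTATS ON AUTO_REORG ON;
-- Registry variable, set from the shell (value is an example):
--   db2set DB2_EXTENDED_OPTIMIZATION="OPT_SORTHEAP_EXCEPT_COL 50000"
```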
Shadow Tables : CDC Configuration Hints
• maintain_replication_mqt_latency_table
  • New parameter in CDC which tells it to record the time of the last replication in the DB2 SYSTOOLS.REPL_MQT_LATENCY table
  • DB2 checks this table to determine if a query can be routed to a shadow table within the specified REFRESH AGE
  • Set this to true
• acceptable_latency_in_seconds_for_column_organized_tables
  • New parameter in CDC for optimizing replication with BLU
  • Normally CDC will apply changes as quickly as possible; for column-organized tables, it is usually more efficient to batch the changes
  • This parameter controls the delay until the next batch of changes is sent to the database
  • Recommendations : keep this parameter lower than the DB2 REFRESH AGE special register; we have found the default of 5 seconds to work well
• fastload_refresh_commit_after_max_operations
  • Limits the number of rows in each LOAD transaction during refresh
  • Recommendation : for maximum compression on shadow tables, increase the value to the row count of the largest row-organized source, or the maximum value CDC allows
• global_max_batch_size
  • Controls the maximum size of the JDBC batch array applied on a given target table
  • Recommendation : increase the value to 1024 to take full advantage of batching opportunities
• Consider using CDC’s fast apply capability, e.g.
  • Group by Table : tells CDC to reorder operations so that operations against the same table are consecutive; this increases the effectiveness of global_max_batch_size
  • Parallelize by Table : similar to Group by Table, but instead of applying the reordered operations on a single database connection, the operations are applied concurrently across multiple database connections
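The CDC system parameters above can be set from the CDC installation's command line; a sketch, assuming CDC's dmset utility and an instance named cdcinst (both the utility invocation and the instance name should be checked against your CDC version):

```shell
# Hypothetical CDC instance name; run from the CDC installation's bin directory
dmset -I cdcinst maintain_replication_mqt_latency_table=true
dmset -I cdcinst acceptable_latency_in_seconds_for_column_organized_tables=5
dmset -I cdcinst global_max_batch_size=1024
```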
Shadow Tables : Summary
• Dramatically faster reporting with BLU Acceleration !
  • Typically at least an order of magnitude faster
• Possibly faster OLTP
  • Via reduction in indexes on the row-organized tables
• No application change
  • The database query compiler automatically decides which format to access (row-organized or column-organized)
DB2 Cancun Feature Highlights
Get the answers you need in the moment
• A new leap in the core BLU Acceleration technology
  • Up to 330% improvement in query workload performance
  • Significant improvements in ETL & INSERT/UPDATE/DELETE performance (up to 112x in one case)
  • Significant string & character compression improvements
  • Several other enhancements in the core technology
• New hardware optimizations
  • For new POWER8 and Intel processors
Core BLU Technology Advances : The Revolution Continues !
[Chart: Cancun BLU query workload performance relative to 10.5.0.1 — Cognos ROLAP (AIX) 160%, Cognos BI (AIX) 140%, US Bank (AIX) 130%, European ISV (Linux) 148%, TPC-H (AIX) 170%, SAP-BW (AIX) 154%, TPC-DS (Linux) 330% improvement]
Core BLU Tech Advances : A Quick Look Under-the-Covers
String Synopsis Improvements
• Enhanced support for CHAR and VARCHAR data in BLU synopsis tables
  • Previously, CHAR columns were included in the synopsis only if declared as a primary or foreign key
  • In Cancun, both CHAR(n) & VARCHAR(n) are eligible for inclusion in the synopsis, if n is not excessive
  • Benefit : improved filtering for queries with predicates on these columns
    • In one internal test we saw a 25 to 70% improvement in targeted queries with VARCHAR predicates
More Join Technology Advances
• Foreign key filters are now …
  • Applicable to more joins
  • Automatically disabled at run-time if the filtering rate is low & dynamically re-enabled if a subsequent sample shows an improved filtering rate (applies to join filters as well)
• More in-memory layout optimizations
  • Particularly for joins with VARCHAR data
  • Tip : the usual practice of optimizing VARCHAR length applies here; e.g. don’t use VARCHAR(80) if VARCHAR(20) suffices
Core BLU Tech Advances : A Quick Look Under-the-Covers (continued)
Better execution plans for queries involving common table expressions
• For queries of the form …
  WITH my_table(c1, c2) AS (SELECT … )
  SELECT … FROM my_table WHERE …;
  … we have seen examples of such queries running 5x, 7x, 40x faster than before
NUMA Affinity and other Memory Optimizations
• Set DB2_RESOURCE_POLICY=AUTOMATIC,MEMAFFINITY for improved memory affinity on large NUMA machines (currently POWER only)
• Reduced memory consumption
Groupby Optimizations
• Spilling optimizations
Core BLU Tech Advances : A Quick Look Under-the-Covers (continued)
Significant Optimizations in “Point” SQL Statements
• Statements of the form …
  SELECT … FROM blutable WHERE pk=x;
  UPDATE blutable SET … WHERE pk=y;
  DELETE FROM blutable WHERE pk=z;
  … benefit from
  • Internal performance optimizations in primary index exploitation
  • Additional isolation level support
• And INSERT, IMPORT and INGEST benefit from several internal optimizations, including improved internal memory management algorithms during INSERT processing
• Single row ETL statements are dramatically faster !
  • Single row UPDATE / DELETE statements measured at up to 49x-112x faster
  • Performance benefit most pronounced when updating or deleting a row identified by the primary key
  • Single row INSERT … VALUES statements are up to 40% faster
• Insert from subselect measured at up to 35% faster !
• INGEST utility measured at up to 3.0x faster !
[Chart: Cancun BLU INSERT/UPDATE/DELETE & ETL/ELT performance relative to 10.5.0.1 — UPDATE based on primary key, with all non-key columns updated: 1x (10.5.0.1) vs 112x (Cancun). Lab tests; your mileage may vary.]
BLU Compression Enhancements
• Improved anticipation of future data
  • Based on customer experience and feedback, compression algorithms were tuned to further improve the ability of column-level dictionaries to handle future data
• Broader application of BLU page compression
  • Page compression is similar in concept to DB2’s adaptive compression for row-organized tables
  • Each page can have its own unique compression dictionary for the values on that page
  • Allows compression to adapt to changing data patterns over time
• Deeper compression for variable length strings
  • BLU can choose deeper compression algorithms for VARCHAR and VARGRAPHIC columns, if the estimated benefits are high
Deeper VARCHAR & VARGRAPHIC Compression
[Chart: relative storage consumption for 6 VARCHAR columns — uncompressed vs pre-Cancun vs Cancun; lower is better. Lab tests; your mileage may vary.]
Cancun Streamlines LOAD for Column-Organized Tables
• Aside : what is the LOAD ANALYZE phase ?
  • Examines input data to build effective compression dictionaries for column-organized tables
  • Used when LOADing into an empty BLU table or using LOAD REPLACE
  • Before Cancun, the duration of the ANALYZE phase grows with the size of the loaded data
• Observation : analyzing a (small) subset of the data is often sufficient to build very effective compression dictionaries
• New in Cancun :
  • The ANALYZE phase automatically examines only a subset of the data
• Benefits :
  • Notable reduction of LOAD time for large data volumes
  • No or very low impact on compression in most cases
  • Reduced need for temporary staging files when loading from a pipe
[Charts: load duration in hours and compressed size in GB for a US bank and a major manufacturer, pre-Cancun vs Cancun; lower is better. Lab tests; your mileage may vary.]
Yet More New BLU Goodies in Cancun
User-Maintained MQTs on Column-Organized Tables
• Benefits
  • Performance : BLU is ultra fast … but using BLU to access the pre-computed result of highly complex queries is even faster !
  • Compatibility : can ease adoption in existing databases that contain MQTs
• Example
-- Column-organized source table:
CREATE TABLE t1col ( c1 INTEGER, c2 DATE, c3 INTEGER ) ORGANIZE BY COLUMN;
-- User-maintained MQT:
CREATE TABLE t1col_mqt AS
  (SELECT c2, SUM(c3) AS sum_c3 FROM t1col WHERE c1 > 100 GROUP BY c2)
  DATA INITIALLY DEFERRED REFRESH DEFERRED
  MAINTAINED BY USER ORGANIZE BY COLUMN;
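Since the MQT above is MAINTAINED BY USER, you populate and maintain it yourself; a minimal sketch of bringing it online and letting the optimizer route to it (register settings follow the shadow-table pattern shown earlier):

```sql
-- Bring the MQT out of check-pending state without a DB2-managed refresh
SET INTEGRITY FOR t1col_mqt ALL IMMEDIATE UNCHECKED;

-- Populate it yourself; you are responsible for keeping it in sync
INSERT INTO t1col_mqt
  SELECT c2, SUM(c3) FROM t1col WHERE c1 > 100 GROUP BY c2;

-- Let the optimizer consider user-maintained MQTs for query routing
SET CURRENT MAINTAINED TYPES FOR OPTIMIZATION USER;
SET CURRENT REFRESH AGE ANY;
```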
Yet More New BLU Goodies in Cancun (continued)
Using MERGE on Column-Organized Tables
• Cancun enables additional flexibility in your ETL/ELT jobs with support for the MERGE SQL statement
• Examples
-- Merge from a staging table:
MERGE INTO blut r USING ( SELECT c1, c2 FROM blus ) s ON (r.c1 = s.c1)
  WHEN MATCHED THEN UPDATE SET c2 = s.c2
  WHEN NOT MATCHED THEN INSERT VALUES (s.c1, s.c2);

-- Merge from a single record source:
MERGE INTO t1 USING ( VALUES (?, ?) ) AS t2 (pk, i1) ON ( t1.pk = t2.pk )
  WHEN MATCHED THEN UPDATE SET i1 = t2.i1;
Yet More New BLU Goodies in Cancun (continued)
Easier Schema Change
• With Cancun you can add new columns to column-organized tables with ALTER TABLE
  • This has enabled further ISV enablement for BLU
• Same syntax as for row-organized tables, e.g.
ALTER TABLE t1
  ADD COLUMN c4 INT
  PRIMARY KEY DEFAULT 5 NOT NULL;
LOAD Recoverability Improvements
• You can now use LOAD with the COPY option
• LOAD operations that use the RESETDICTIONARYONLY option are now recoverable
  • Note : the RESETDICTIONARYONLY option creates column-level dictionaries without loading any data
• Benefits
  • In the event of a failure, you can more easily recover to your selected point in time by rolling forward through these LOAD operations
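A sketch of a recoverable load into a column-organized table; the table, file, and path names are placeholders, and the empty input for the dictionary-only load is illustrative:

```sql
-- Load with a backup copy of the loaded data, so ROLLFORWARD can pass through it
LOAD FROM /staging/trades.del OF DEL
  REPLACE INTO trade_col
  COPY YES TO /db2/loadcopies;

-- Rebuild column-level dictionaries without loading any rows (now recoverable too)
LOAD FROM /dev/null OF DEL
  REPLACE RESETDICTIONARYONLY INTO trade_col;
```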
Cancun and POWER8
Cancun exploits several key POWER8 features to improve performance & efficiency:
• More cores per CPU ( 12 in P8 vs 8 in P7 )
• More threads per core ( SMT8 in P8 vs SMT4 in P7 )
• PCI-E Gen3 ( faster messaging and storage I/O )
• New instructions that speed up database processing
Example : New SIMD Instructions
• DB2 exploitation of new P8 VSX instructions speeds up predicate scanning by 2x
  • >2x faster per 100,000 16K pages
Example : New Fletcher64 Checksum
• Page verification path length reduced by 38% via an optimized checksum calculation using new instructions
  • 38% less CPU use during any page I/O; workload 17% faster
Example : New POWER8 Instruction Profiling
• New instruction reordering to optimize cache hit ratio and branch prediction efficiency
Cancun Feature Highlights
Get the answers you need in the moment
• Oracle compatibility for BLU and other SQL enhancements
  Extend the advantages of BLU to an even broader set of applications in your enterprise
SQL Compatibility for BLU Acceleration : Highlights
• Full use of compatibility vector support for column-organized tables, e.g.
  db2set DB2_COMPATIBILITY_VECTOR=ORA
  • Tells DB2 to behave in Oracle compatibility mode
• Support for all compatibility vector options, including:
  • Data type compatibility : VARCHAR2, NVARCHAR2, NUMBER (& DECFLOAT), DATE
  • PL/SQL, outer join operator (+), etc.
  • Most, but not all, run natively in the BLU runtime. Take care when using:
    • ROWNUM
    • Hierarchical queries (CONNECT BY)
    • LIMIT OFFSET
• Additional built-in functions now operate directly on columnar data
  • Additional DECFLOAT arithmetic operations
  • SUBSTRB, SUBSTR2, SUBSTR4
  • LENGTH2, LENGTH4, LENGTHB
  • INSTR, INSTR2, INSTR4, INSTRB and LOCATE_IN_STRING
  • Multi-parameter versions : LTRIM, RTRIM, LCASE, TRANSLATE, …
• Not currently supported
  • Character-based columns (CODEUNITS32), extended row size, ROWID, compiled PL/SQL triggers, EXCLUDE NULL KEYS indexes, index-on-expression indexes
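Putting the compatibility vector to work might look like the following sketch; the registry variable must be set (and the instance recycled) before the database is created, and the table and database names are placeholders:

```sql
-- From the shell: set the registry variable, then recycle the instance
--   db2set DB2_COMPATIBILITY_VECTOR=ORA
--   db2stop
--   db2start

-- Databases created after this point get Oracle-compatible behavior,
-- e.g. Oracle-style data types on a column-organized table:
CREATE TABLE orders_col (
  id    NUMBER(10)    NOT NULL,
  name  VARCHAR2(64),
  ts    DATE                    -- behaves like Oracle DATE (date + time)
) ORGANIZE BY COLUMN;
```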
Cancun Feature Highlights
Get the answers you need in the moment
• New pureScale deployment options
  • pureScale on commodity networks
  • New virtualization options
• HA and DR support for BLU
  • A simple local availability and disaster recovery solution for your BLU databases
• Several other availability and related advances, including …
  • Wider utility support
  • Broader options for the Geographically Dispersed pureScale Cluster (GDPC)
pureScale with Commodity Ethernet
• Prior to Cancun, pureScale required a Remote Direct Memory Access (RDMA) capable cluster interconnect
  • Unique IBM RDMA exploitation, patterned after the mainframe Sysplex, provides strong scalability, even for write-intensive workloads
  • RDMA cluster interconnect options : RoCE 10Gb Ethernet, InfiniBand
    • InfiniBand switch, or 10G Ethernet switch with 802.3 global pause control
    • InfiniBand adapter, or 10G Ethernet adapter with RoCE support
• DB2 Cancun adds the option to run pureScale over a traditional Ethernet network without RDMA capability
  • Why ? Faster cluster setup and lower cost deployments using commodity network hardware
  • Works with any Ethernet switch that supports TCP/IP sockets
pureScale with Commodity Ethernet : Typical Use Cases
• Mission critical workloads that do not need pureScale’s RDMA optimizations
  • E.g. small clusters and/or moderate workloads where availability is the primary motivator for pureScale
  • Provides the same level of availability as pureScale with RDMA
  • E.g. active/active, online fast failover, online rolling updates, etc.
• Enterprise Database Grid
  • Run multiple database workloads on a single pureScale cluster, to reduce total operational costs
  • Individual databases typically require high availability, but may not require scale-out
  • This deployment pattern typically has lower communication bandwidth needs
• Virtual Environments
  • Provides new options for deploying pureScale in virtual environments
  • Great option to consolidate multiple pureScale clusters over the same set of machines
  • New event-based communications model with commodity Ethernet results in efficient CPU sharing across the clusters
[Chart: performance example, modest OLTP — relative transactions per second for a 90% read / 10% write transaction workload; members and CFs running 4-core x86 with a single 10GE adapter. Lab tests; your mileage may vary.]
Enterprise Data Grid
• A single pool of members can serve all database workloads; each database is active on some members and passive (until failover) on the others
• Consolidate multiple workloads on the same resource infrastructure
• Save management and resource costs
[Diagram: workloads A, B, … N each connecting to their own database on a shared pureScale cluster of members, CF, and shared storage]
Enterprise Data Grid : Benefits from pureScale’s Explicit Hierarchical Locking (EHL)
• EHL removes data sharing costs for tables/partitions that are only accessed by a single member
  • Automatically avoids CF communication if table/index sharing is not occurring in the workload
• Example target scenarios
  • Consolidating multiple database workloads in a single pureScale cluster
  • Multi-tenant workloads
  • Directed batch
• Enabled via the new OPT_DIRECT_WRKLD database configuration parameter
  • Detection of data access patterns happens automatically, and EHL kicks in when data is not being shared after the configuration parameter is set
[Diagram: each database served by a single member, so no member/CF communication is necessary]
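Enabling EHL for each consolidated database might look like the following sketch; the database names are placeholders, and the YES value should be checked against your release's documentation:

```sql
-- Enable EHL-style direct workload optimization per database
UPDATE DB CFG FOR db1 USING OPT_DIRECT_WRKLD YES;
UPDATE DB CFG FOR db2 USING OPT_DIRECT_WRKLD YES;
-- DB2 then detects single-member access patterns automatically and
-- avoids CF communication for data that is not being shared
```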
New Virtualized Deployment Options for pureScale
• Previously, virtualized deployments of pureScale were limited to:
  • AIX LPARs, with dedicated RDMA network adapters per partition
  • KVM with RHEL, with dedicated 10GE RoCE network adapters per partition
• Additional virtualized deployments are now available in the DB2 Cancun release with commodity Ethernet interconnect, including:
  • AIX LPARs
  • VMware with RHEL or SLES
  • KVM with RHEL
• Virtualized environments are perfect for:
  • Production environments with moderate workloads
  • Development systems
  • QA and testing systems
• Efficient CPU sharing : the new event-based communication design means the CF only consumes CPU when responding to a member
Workload Consolidation with Virtualization : An Example
Virtualized Deployments: Supported Configurations

OS / Virtualization Technology                    InfiniBand   RDMA-capable Ethernet   Commodity Ethernet
AIX, SLES, RHEL — no virtualization (bare metal)  Yes*         Yes*                    Yes
AIX LPAR                                          Yes*         Yes*                    Yes
SLES VMware                                       No           No                      Yes
RHEL VMware                                       No           No                      Yes
KVM                                               No           Yes*                    Yes

* Dedicated interconnect adapter(s) per host/partition
• VMware supported with
  • Any x64 system that is supported by both the VM and DB2 pureScale
  • Any Linux distribution that is supported by both the VM and DB2 pureScale
• KVM supported with
  • Any x64 system that is supported by both RHEL 6.2 and DB2 pureScale
pureScale with Commodity Ethernet : Best Practices & Deployment Guidelines
• New DBM configuration parameter CF_TRANSPORT_METHOD is set to either TCP or RDMA at instance creation time
  • Can be changed without rebuilding the cluster
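Inspecting and switching the transport method might look like this sketch (it assumes a full instance stop/start is acceptable; confirm the procedure for your release):

```sql
-- Inspect the current setting; look for CF_TRANSPORT_METHOD in the output
GET DBM CFG;

-- Switch the cluster interconnect to TCP sockets (commodity Ethernet)
UPDATE DBM CFG USING CF_TRANSPORT_METHOD TCP;
-- Stop and restart the instance on all hosts for the change to take effect
```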
•
Prerequisites
•
10 Gigabit Ethernet strongly recommended
for most installations
•
1 Gigabit Ethernet supported, but advisable only for read intensive workloads
(95%+ read) , or very small system (eg. 2 member cores )
•
Set DB2_SD_ALLOW_SLOW_NETWORK=YES to enable 1GE implementations
•
Hosts must be on the same subnet
•
Hosts must be able to ping each other
● Use bonded Ethernet adapters for availability
● To avoid loss of a member or CF in the event of a single adapter failure
● Use bonded Ethernet adapters for performance; example rules of thumb:
● On each CF: one 10 GbE adapter dedicated for every 12 member cores in the cluster
● On each member: one 10 GbE adapter dedicated for every 12 cores on the member
● As the business workload increases, keep an eye on network bandwidth consumption
● DB2 provides a comprehensive set of interfaces for monitoring the network infrastructure (e.g. MON_GET_CF_WAIT_TIME and friends)
● Add adapters as necessary
● In extreme cases, consider converting the cluster to RDMA
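A hedged sketch of how CF communication waits might be inspected with the monitoring table function mentioned above (the result-column names here are illustrative assumptions; check the MON_GET_CF_WAIT_TIME reference for your DB2 level before relying on them):

```sql
-- Sketch: top CF-command wait times across all members (-2 = all members)
-- Column names are assumptions; verify against the documentation.
SELECT member,
       id AS cf_id,
       cf_cmd_name,
       total_cf_wait_time_micro
FROM TABLE(MON_GET_CF_WAIT_TIME(-2)) AS t
ORDER BY total_cf_wait_time_micro DESC
FETCH FIRST 10 ROWS ONLY;
```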
pureScale with Commodity Ethernet :
• Active/active DR solution that looks like a single system
• Site-to-site distance limited to several tens of kilometres
• Due to synchronous communications
• A 3rd site is used for quorum tiebreaking
• Key benefits
• Make use of DR system cycles; both sites can actively perform reads and writes
• No conflicting updates, as in replicated systems
• Continually test your DR system
[Diagram: members M1–M4 and the primary/secondary CFs (CFP, CFS) split across Site A and Site B, tens of km apart, with application connections distributed roughly evenly across both sites]
GDPC Enhancements in Cancun
• More platform support:
• AIX with Infiniband
• AIX with 10 GE RoCE
• RedHat Linux with 10 GE RoCE
• SuSE Linux with 10 GE RoCE
• Fast I/O (a.k.a. SCSI-3 persistent reserve) fencing can now be enabled without SAN access at the tie-breaking site
• Benefit from pureScale's ultra-fast (10s of seconds) failover with a simpler SAN connectivity design
• Note: a new lab services offering is used for initial setup
• Benefit from IBM's experience in setting up this multi-site configuration
Turn-key Availability and DR Solution for BLU
[Diagram: an HADR topology across Sydney, Melbourne, and Perth combining a synchronous local standby, asynchronous remote standbys with time-delayed apply, and transparent automatic client re-route between primary and standby database servers]
• DB2 HADR is now supported with BLU ! Use it to …
• Automatically maintain a DR site
• Maintain another local fast-failover system
• Keep another system a few minutes behind, to quickly recover from logical errors
• One or all of these at the same time !
• Comprehensive HADR feature support, including …
• All sync modes
• Peer Window
• Multiple Standby
• Log Spooling
• Replay Delay
• Automatic Failover
• Automatic Client Re-route
• LOAD with COPY
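As a minimal sketch of the time-delayed-standby idea, the configuration might look like this (hostnames, ports, and the database name are placeholders; see the HADR setup documentation for the full procedure):

```
-- On the primary (hostA) for a hypothetical database "sample":
UPDATE DB CFG FOR sample USING HADR_LOCAL_HOST hostA HADR_LOCAL_SVC 51000
UPDATE DB CFG FOR sample USING HADR_REMOTE_HOST hostB HADR_REMOTE_SVC 51000
UPDATE DB CFG FOR sample USING HADR_SYNCMODE SUPERASYNC
-- Keep the standby 5 minutes (300 seconds) behind for logical-error recovery:
UPDATE DB CFG FOR sample USING HADR_REPLAY_DELAY 300
-- Then: START HADR ON DB sample AS STANDBY (on hostB),
-- followed by START HADR ON DB sample AS PRIMARY (on hostA)
```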
Incremental Backup Support for pureScale
[Diagram: a weekly backup schedule, e.g. a full backup on Sunday, delta backups on most weekdays, and an incremental backup mid-week]
• Same concept as without pureScale
• Incremental : includes changes since the last full backup
• Delta : includes changes since the last backup of any type
• The DB2 Merge Backup utility supports pureScale as well
• pureScale-specific notes
• As with any type of backup, incrementals can be run on any member, and the images can be restored on any member
• Before Cancun, some topology changes required a full offline database backup:
• DROP MEMBER
• An incremental or delta backup is now sufficient !
• Reminder: set the TRACKMOD configuration parameter to YES
[Diagram: runtime recovery replays the full backup, incremental and delta images, and logs; DB2's automatic recovery (or the RECOVER command) does all these steps for you "under the covers"]
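The schedule above can be sketched with standard backup commands (database name and target path are placeholders):

```
-- Enable change tracking; a full backup is required afterwards
UPDATE DB CFG FOR sample USING TRACKMOD YES

-- Sunday: full online backup
BACKUP DB sample ONLINE TO /backups
-- Mid-week: cumulative incremental (changes since the last full backup)
BACKUP DB sample ONLINE INCREMENTAL TO /backups
-- Other days: delta (changes since the last backup of any type)
BACKUP DB sample ONLINE INCREMENTAL DELTA TO /backups

-- On restore, DB2 can locate and replay the required chain automatically
RESTORE DB sample INCREMENTAL AUTOMATIC FROM /backups
```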
Integrated Flashcopy Support for pureScale
• Same concept as without pureScale
• Quick & easy way to take, manage, and restore space-efficient backups
• Under the covers, suspends DB2 from doing write I/Os, and then issues the storage commands necessary to take a flash copy (a.k.a. snapshot) backup
• In pureScale, uses a space-efficient GPFS file system snapshot
• Examples
DB2 BACKUP DATABASE sample USE SNAPSHOT EXCLUDE LOGS
DB2 RESTORE DATABASE sample USE SNAPSHOT
Side note: EXCLUDE LOGS has been optimized to not suspend write I/Os on the current log file (of course, database page writes are still suspended). The net is that you should see even less impact to your workload during the relatively short time that the snapshot is taken.
Parallel Version Upgrade
db2iupgrade -g myinstance
Significant Improvements in Version Upgrade Speed
• Upgrading a pureScale instance is now done automatically for all hosts, and in parallel across all hosts in the instance
• Can speed up the upgrade by a factor of n, with n hosts
• Reduces the potential for human errors
• Usage hints
• Invoke from any member (not a CF)
• A database connect will be performed under the covers to check the state of the database; this requires a member
• As usual, run db2ckupgrade prior to the instance upgrade
[Diagram: four members and two CFs upgraded in parallel]
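A sketch of the overall sequence the hints above imply (instance and database names are placeholders):

```
# 1. Verify the database is ready for upgrade (run from the new DB2 copy)
db2ckupgrade sample -l upgrade.log

# 2. Upgrade the instance automatically and in parallel across all hosts
#    (run from any member, not a CF)
db2iupgrade -g myinstance

# 3. Upgrade the database itself
db2 UPGRADE DATABASE sample
```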
Cancun Feature Highlights
Get the answers you need in the moment
• New optimizations for SAP workloads, developed in partnership with SAP
• Get real-time answers to all questions with instant insight into historical and "as it happens" data
• Improve performance while reducing complexity and overhead
• Get fast time-to-value using skills you already have for Oracle database
• Change the economics of continuous availability with broad infrastructure choices at every price point
• Superior performance at lower cost for SAP and other environments
Pre-Cancun SAP BW Support of BLU
[Diagram: SAP BW layered architecture — a data acquisition layer (PSAs) feeding an enterprise data warehouse layer (DSOs), a BW reporting layer (InfoCubes and aggregates) queried by BI reporting tools, plus master data and BW near-line storage (NLS) for old, infrequently accessed data]
• Available with SAP BW 7.0 and higher
• Supported BW objects
• InfoCube
• Aggregate
• BW temporary tables
• Supported NLS objects
• NLS InfoCube
• NLS DSO
• Benefits
• Significantly faster and more stable BW query performance without tuning
• Space reduction
• Fewer aggregates
Cancun is Designed for Comprehensive SAP BW Support
[Diagram: the same SAP BW architecture, with all layers — including flat InfoCubes, DSOs, PSAs, and master data — targeted in the Cancun design]
• The Cancun design targets all BW objects, including:
- Flat InfoCube with SAP BW 7.40
- DataStore Object (DSO)
- Master Data
- PSA
- InfoSet
• Designed for SAP BW 7.0 and higher
• More benefits
- Support of all BW objects
- More space reduction
- Reporting on DSO
- BW queries
- Faster BW queries on classical InfoCubes, because master data are on BLU
- Faster BW queries with flat InfoCubes
- ETL processing
- Fewer ETL processing steps
- Faster ETL with Cancun INSERT/UPDATE/DELETE improvements
- Faster data propagation into flat InfoCubes
SAP 1TB BW : Cancun is now 74x Faster than 10.1 !
[Chart: average query elapsed time in seconds, lab tests, YMMV — 10.1 row-organized: 1105 s; 10.5 GA BLU: 29 s; 10.5 FP1 BLU: 22 s; Cancun BLU: 15 s, a 74x speedup over 10.1]
Introducing DB2 “Cancun”
• Get real-time answers to all questions with instant insight into historical and "as it happens" data
• Improve performance while reducing complexity and overhead
• Get fast time-to-value using skills you already have for Oracle database
• Change the economics of continuous availability with broad infrastructure choices at every price point
• Superior performance at lower cost for SAP and other environments
Get the answers you need,… in the moment
• With BLU Shadow Tables : measured 13%-330% faster over several workloads
• With Oracle Compatibility for BLU
• pureScale on Commodity Networks; HADR for BLU; lots more
• BW measured 74x faster; comprehensive solution
But Wait, There’s More, For Example,…
String Types based on “Logical Characters”
• Allows applications to be coded using true character counts, as opposed to byte counts, in a multi-byte Unicode environment
• Why ? Application development simplicity; compatibility
• Specified via a new string unit attribute
• CODEUNITS32 – A new string unit for the length attribute, meaning the maximum length is in UTF-32 code units. The data continues to be encoded as UTF-8/UTF-16, and processing as a UTF-32 string occurs only when needed. (CHAR is a synonym for CODEUNITS32 in ORA mode)
• CODEUNITS16 – Length attribute is double-byte based. Corresponds to existing graphic string semantics
• OCTETS – Length attribute is byte based. Corresponds to existing character string semantics
CREATE TABLE monument(native VARCHAR(20 CODEUNITS32),
                      english VARCHAR(20 OCTETS));

SELECT LENGTH(native) AS totalen,
       SUBSTR(native, 1, 2) AS first2chr
FROM monument
WHERE english = 'Great Wall of China';

TOTALEN FIRST2CHR
------- ---------
      4 万里

Sample data (native / english): Кремль / Kremlin; 万里长城 / Great Wall of China; ताज महल / Taj Mahal
Monitoring Improvements in Cancun
• New metrics to simplify sort memory fine-tuning for BLU
• New metrics to identify the largest amount of memory used by any individual sort consumer
• sort_consumer_heap_top, sort_consumer_shrheap_top
• New metrics to monitor the number of concurrent sort operations
• active_sort_consumers, active_sort_consumers_top
• Available at new reporting levels (e.g. connection / query) to monitor sort consumption:
• sort_heap_allocated, sort_shrheap_allocated, sort_heap_top, sort_shrheap_top
• New time-spent metrics
• Online backup time
• total_backup_time / total_backup_proc_time / total_backups
• Index creation / recreation time
• total_index_build_time / total_index_build_proc_time / total_indexes_built
Reminder: DB2_WORKLOAD=ANALYTICS automatically sets an initial sort configuration that is appropriate for typical analytic workloads. Consider using these new metrics to fine-tune it further.
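A hedged sketch of how the sort metrics listed above might be queried (the metric names come from the list above; the choice of MON_GET_DATABASE as the reporting interface is an assumption — verify which table functions expose these columns at your DB2 level):

```sql
-- Sketch: peak sort memory use and concurrency at the database level
-- (-2 = all members)
SELECT member,
       sort_shrheap_allocated,
       sort_shrheap_top,
       active_sort_consumers_top,
       sort_consumer_shrheap_top
FROM TABLE(MON_GET_DATABASE(-2)) AS t;
```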
Enforced Backup Encryption
• Allows a security administrator to ensure that any backup taken by a DBA is encrypted
• Easier adherence to security standards and/or regulatory requirements
• Works with InfoSphere Guardium Data Encryption
• Simple to use: a person with SECADM authority does this:
DB2 UPDATE DB CFG FOR sample USING ENCRLIB '/opt/vormetric/eet/agent/lib/libeeetdb2.so'
DB2 UPDATE DB CFG FOR sample USING ENCROPTS 'keylength=256'
• Only a SECADM can change these parameters
• When ENCRLIB is set, every BACKUP will be encrypted
• SQL2459N is returned if the user tries to override this using a different ENCRLIB option on the backup command
• Enabled in DPF, pureScale, and single-node configurations