IBM DB2 Analytics Accelerator
© Copyright IBM Corporation 2011. All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE
INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM, the IBM logo, ibm.com, DB2, InfoSphere, Cognos, and InfoSphere Warehouse on System z are trademarks or
registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
Would You Still Use Google
If It Took 3 Days And 7 People
Nearly 70%
of data warehouses experience
performance-constrained issues of various types.
- Gartner 2010 Magic Quadrant
months to deploy
specialized resources required
constant tuning
Traditional Systems Landscape
OLTP Staging Area Operational
Data Store
Enterprise DWH
Data Marts
ETL
ETL ETL ETL ETL
Applications
Historical reasons:
•Different access patterns •impact on performance
•EDW as the data integration hub •again, impact on performance
Negative ramifications: •Complexity
•both in systems management and in applications •Difficulties in supporting real time analytics
Too complex infrastructure
Too complicated to deploy
Too much tuning required
Too inefficient at analytics
Too many people needed to maintain
Too costly to operate
Traditional data warehouses
They are based on databases optimized for transaction processing—
NOT to meet the demands of advanced analytics on big data.
are just too complex
Data continues to
expand exponentially.
Analytics are becoming more complex as
business demands faster answers.
The right data warehouse
Visionary Systems Landscape
OLTP Staging Area Operational
Data Store
Enterprise DWH
Data Marts
ETL
ETL ETL ETL ETL
Applications
Benefits
• Consolidating all the components into a single system
• Uniform access to any data
• Efficient data movement within the system (ideally, no network)
• Opportunity to remove some of the components
Challenges
• Mixed workload management capabilities • Universal processing capabilities to deliver
best performance for both transactional and analytical workloads
• Providing industry leading availability, security and reliability to all types of workload
System z Data Sharing and Parallel Sysplex technology provides all the needed characteristics except one:
ISAOpt V1 Needed Following Enhancements
1. Increase applicability by relaxing current off-load restrictions
2. Increase applicability by supporting larger amount of data
3. Support concurrent query execution
4. Improve data currency
5. Support disaster recovery
IBM DB2 Analytics Accelerator
Capitalizing on the best of both worlds – System z and Netezza
How is it different
Performance: Unprecedented
response times to enable 'train of
thought' analyses frequently blocked by
poor query performance.
Integration: Connects to DB2 through
deep integration providing transparency
to all applications.
Self-managed workloads: queries are
executed in the most efficient way
Transparency: applications connected
to DB2 are entirely unaware of the
Optimizer
Simplified administration: appliance
hands-free operations, eliminating many
database tuning tasks
What is it?
The IBM DB2 Analytics Accelerator is a workload optimized, appliance add-on, that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times.
IDAA Preserves ISAOpt V1 Key Value Propositions
DB2 continues to own data (both OLTP and DW)
• Access to data (authorization, privileges, …)
• Data consistency and integrity (backup, recovery, …)
• Enables extending System z QoS characteristics to BI/DW data as well
Applications access data (both OLTP and DW) only through DB2
• DB2 controls whether to execute query in DB2 mainline or route to ISAO
• DB2 returns results directly to the calling application
• Enables mixed workloads and selection of optimal access path (within DB2
mainline or ISAOpt/IDAA) depending on access pattern
IDAA continues to be implemented as DB2 internal component
• DB2 provides key IDAA status and performance indicators as well as typical
administration tasks by standard DB2 interfaces and means
• No direct access (log-on) to IDAA
• Enables operational cost reduction through skills, tools and processes
z/OS on
System z
Superior availability reliability, security,
Workload mgmt
Deep DB2 Integration within zEnterprise
Data
Manager
Buffer
Manager
IRLM
Log
Manager
IBM
DB2
Analytics
Accelerator
Applications
DBA Tools, z/OS Console
, .... . .
Operational Interfaces (e.g. DB2 Commands) Application Interfaces (standard SQL dialects) Netezza 1000 HW Industry leading DW performance, ease of useDB2 for z/OS
Query Execution Process Flow
DB2 for z/OS
Optimizer ISAO in te rface Application Application Interface HeartbeatQuery execution run-time for queries that cannot be or should
IDAA - Bringing Netezza AMPP
TMArchitecture to DB2
Advanced Analytics DBA Legacy Reporting BI FPGA Memory CPU FPGA Memory CPU FPGA Memory CPUSMP
Host
Disk
Enclosures
S-Blades™
Network
Fabric
IBM DB2 Analytics Accelerator
DB2 for
z/OS
•V9
•V10
AMPP = Asymmetric Massively Parallel Processing
Netezza 1000
IBM DB2 Analyt ics Acc elerat or IB M D B 2 Ana lytics A cce le ra to rThe Key to the Speed
FPGA Core
CPU Core
Uncompress Project Restrict, Visibility
Complex 㺌㺌㺌㺌 Joins, Aggs, etc. select DISTRICT, PRODUCTGRP, sum(NRX) from SALES where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO‘ group by … Slice of table SALES (compressed) Slice of table SALES (compressed)
SMP Hosts
Snippet Blades
TM(
S-Blades, SPUs)
Disk Enclosures
Netezza 1000 basic design aspects
•IDAA Server •SQL Compiler •Query Plan
•Optimize Administration
2 front/end hosts, IBM 3650M3 clustered active-passive
2 Nehalem-EP Quad-core 2.4GHz per host
•Processor & streaming DB logic •High-performance database engine
streaming joins, aggregations, sorts, etc.
e.g. TF12: 12 back/end SPUs
•Slice of User Data
•Swap and Mirror partitions •High speed data streaming •High compression rate
EXP3000 JBOD Enclosures
12 x 3.5” 1TB, 7200RPM, SAS (3Gb/s)
max 116MB/s (200-500MB/s compressed data) e.g. TF12:
8 enclosures 㸢96 HDDs
Disk Mirroring and Failover
All user data and temp space mirrored
Disk failures transparent to queries and transactions
Failed drives automatically regenerated
Bad sectors automatically rewritten or relocated
Primary
Mirror
Connectivity Options
Multiple DB2 systems can connect to a single IDAA
A single DB2 system can connect to multiple IDAAs
• residing in the same LPAR Multiple DB2 systems can connect to multiple IDAAs
Full flexibility for DB2 systems:
Better utilization of IDAA resources
Network Configuration Options
Option 1 – Simple Direct attachment
– Virtual IP definition both on System z and Netezza
– Only one network link active at a time – if Netezza fails over to standby host, connection might get lost
OSA ports are configured such that only one link is active at a time. If the connection on this link breaks, the other is activated
SMP Host 1 SMP Host 2 OSA port 1 port 2 10G bE port 1 port 2 10G bE port 1 port 2
Option 2 – Additional redundancy or additional CEC requires Switch(es)
– Can address cable failures and Netezza fail over to standby host – For higher availability requirements, a second switch is required
SMP Host 1 SMP Host 2 OSA port 1 port 2 10G bE port 1 port 2 10G bE port 1 port 2 Switch 1 Switch 2 OSA port 1 port 2 IDAA IDAA System z System z
OSA ports are configured such that packages are sent
alternating, using both ports: “multipathperconnection” configuration
Network Configuration Options
(continued)
Private Data Network
10 GbE OSA-Express3 10 GbE OSX Ethernet
Switch Private Service Network
TOR Switch
Option 3 – zBX TOR Switch
– For clients with an installed zBX connected to zEnterprise, the
top-of-rack switch may be leveraged to connect to the IDAA
– Connection between zBX and IDAA can be direct or switched
© 2012 IBM Corporation 24 16.04.2012
Feedback from Beta Customer: Fast Time to Value
IBM DB2 Analytics Accelerator (Netezza 1000-12)
Î
Production ready - 1 person, 2 days
Table Acceleration Setup …
2 Hours
– DB2 “Add Accelerator”
– Choose a Table for “Acceleration”
– Load the Table (DB2 copy to Netezza)
– Knowledge Transfer
– Query Comparisons
Initial Load Performance …
Î
5.1 GB in 1 Min 25 Seconds (24M rows)
400 GB in 29 Min (570M rows)
Actual Query Acceleration …
1908x faster
Î
2 Hours 39 Minutes to 5 Seconds
CPU Utilization Reduction …
99% less CPU
Î
24M rows: 56.5 CPU seconds to 0.4 CPU
Large Insurance Company
Adding value by Accelerating the Delivery of Business Reporting
Initial Load Performance 400 GB Loaded in 29 Minutes
570 Million Rows (Actual: Loaded 800 GB to 1.3 TB per hour)
Extreme Query Acceleration - 1908x faster IBM DB2 Analytics Accelerator (Netezza 1000-12)
Production ready - 1 person, 2 days Table Acceleration Setup in 2 Hours
- DB2 “Add Accelerator”
With Accelerated Time to Value
Times Faster Query Total Rows Reviewed Total Qualifying Rows Total Rows
Returned Hours Sec(s) Hours Sec(s)
Query 1 591,941,065 2,813,571 853,320 2:39 9,540 0.0 5 1,908 Query 2 591,941,065 2,813,571 585,780 2:16 8,220 0.0 5 1,644 Query 3 813,343,052 8,260,214 274 1:16 4,560 0.0 6 760 Query 4 283,105,125 2,813,571 601,197 1:08 4,080 0.0 5 816 Query 5 591,941,089 3,422,765 508 0:57 4,080 0.0 70 58 Query 6 813,343,052 4,290,648 165 0:53 3,180 0.0 6 530 Query 7 591,941,065 361,521 58,236 0:51 3,120 0.0 4 780 Query 8 813,343,052 3,425,292 724 0:44 2,640 0.0 2 1,320 Query 9 813,343,052 4,130,107 137 0:42 2,520 0.1 193 13 DB2 Only DB2 with IDAA
ISAO V1 Needed Following Enhancements
1. Increase applicability by relaxing current off-load restrictions
2. Increase applicability by supporting larger amount of data
3. Support concurrent query execution
4. Improve data currency
5. Support disaster recovery
6. DB2 10 support
IDAA addresses all of them!
IDAA addresses all of them!
See further documentation for details
See further documentation for details
Up to 32TB of
Up to 32TB of
uncomressed
uncomressed
data,
data,
e.g. with 1:4 compression ratio, up to 128TB of user data
e.g. with 1:4 compression ratio, up to 128TB of user data
Exploiting
Exploiting
Netezza
Netezza
workload management capabilities
workload management capabilities
Partition
Partition
-
-
scope update
scope update
Building blocks provided
Building blocks provided
IDAA supports both DB2 9 and DB2 10*
IDAA supports both DB2 9 and DB2 10*
Requires
You can also benefit!
•
Are there
long running queries
that could provide business
value if they could be run in seconds vs
30 minutes or more
?
•
What about
performance challenges
with
complex and ad
hoc queries
?
•
Is
modernization of the data warehouse
or
Operational
Business Analytics
a topic of interest?
•
Are there thoughts about
extending the use of operational
platform data
to perform
business analysis
and
daily
reporting
?
•
Is there
a data warehouse running on System z
or an
intention to do so?
•
Is an OLAP application
running out of steam
due to growth in
data. These are usually a single subject area such as
Accounting, Sales or Inventory.
•
The forgotten query
: Have queries been elected to set aside
due to performance challenges?
If you answer one or more of the
following questions with ‘yes‘ please give
me another 3 minutes to show you how
we can help you analyzing your
acceleration potenzial
Quick Workload Test
Customer
Collecting information from dynamic statement cache, supported by step-by-step instruction and REXX script (small effort for customer)
Uploading compressed file (up to some MB) to IBM FTP server
IBM / Center of Excellence
Importing data into local database Quick analysis based on known DB2
Analytics Accelerator capabilities
Report for a first assessment: Acceleration potential for
Queries Estimated time CP cost
Report for a first assessment: Acceleration potential for
Queries Estimated time CP cost
Workload Assessment Output: Sample PDF Report
How much of the current elapsed time may run on
Smart Analytics Optimizer
Detailed query-level assessment of the
workload
Elapsed time per query
SQL statement per query Summary based on queries, query blocks, elapsed time and CPU
time
Reasons why certain query blocks might not run on Smart Analytics
In eigener Sache...
Zusammen mit den Experten der IBM Technical Support Organisation haben unsere
deutschsprachigen IBM System z Spezialisten ein umfassendes Themenspektrum zu aktuellen
Neuerungen und integrierte Software- und Hardware-Lösungen rund um den legendären Mainframe ausgewählt.
Wir freuen uns darauf, Ihnen Informationen in XXL sowie brandheißes Wissen aus erster Hand zu
präsentieren. In insgesamt 10 Tracks erfahren Sie, wie Sie das Potenzial der leistungsstarken Plattform für Ihr Unternehmen maximal nutzen können. Namhafte Sprecher der IBM zeigen Ihnen anschaulich, wie die neuesten Entwicklungen in den Bereichen der IBM Mainframetechnologie dazu beitragen können,
Innovationen voranzutreiben, Kosten zu sparen und Ihre bestehenden Investitionen, geschäftskritischen Daten sowie Prozesse unternehmensweit zu schützen.
Und das alles zu einem sensationellen Preis: Profitieren Sie bis zum 15. März 2012 vom exklusiven
Frühbucherrabatt von 199,– € für beide Tage und melden Sie sich am besten gleich hier an!
2 Ev
ents
1 Pr
eis
SYSPLEX Tables of App 1 Tables of App 2 Tables of App 3 Tables of App 4 Tables of App 5 DSG Member 1 DSG Member 2 App 1 App 2 App 3 App 4 App 5 Switch Switch Short Range IDAA Instance 1
Disaster Recovery Considerations (1 of 2)
Short Range Short Range
Short Range Long Range Tables of App 1 Tables of App 2 Tables of App 3 IDAA Instance 2 Tables of App 4 Tables of App 5
SYSPLEX Tables of App 1 Tables of App 2 Tables of App 3 Tables of App 4 Tables of App 5 DSG Member 1 DSG Member 2 App 1 App 2 App 3 App 4 App 5 Switch Switch Short Range IDAA Instance 1
Disaster Recovery Considerations (2 of 2)
IDAA Instance 2
Short Range Short Range
Short Range Long
Range
Tables Tables Tables
App 1
App 2