Everything You Wanted To Know About DB2 Prefetch
Jeffrey Berger, IBM
Session Code: B09
Tuesday, 17 November 1015, 16:30 Platform: z/OS
Acknowledgements and Disclaimers
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates.
The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant.
While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.
© Copyright IBM Corporation 2014. All rights reserved.
— U.S. Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
— Please update paragraph below for the particular product or family brand trademarks you mention such as WebSphere, DB2,Maximo, Clearcase, Lotus, etc
IBM, the IBM logo, ibm.com, andDB2 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or TM), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at
•“Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
3
Objectives
• Contrast the different types of DB2 prefetch
• Understand when each type is used
• Sequential detection and dynamic prefetch, how does it work
• Explain the causes of sequential synch I/Os and possible solutions to avoid them
• How to manage the prefetch engines
• What will Cypress do for you
4
DB2 trend: Since DB2 9, DB2 is tending to run out of prefetch engines more often. Why?
• DB2 9
• Data Manager replaced sequential prefetch with dynamic prefetch for all index-to-data access for improved sequential detection (and to avoid needless prefetch)
• DB2 10
• Row level sequential detection
• Progressive prefetch quantity
• Index I/O parallelism
• Prefetch engines became zIIP eligible
Synchronous versus asynchronous I/Os
• If a Getpage does an I/O, it waits synchronously
• DB2 supports one I/O per thread
• DB2 cannot read or write multiple pages in a single synchronous I/O
• DB2 uses “prefetch engines” to do read I/O asynchronously
5
Types of DB2 prefetch
• Sequential prefetch
• Fastest way to sequentially prefetch data from disks
• Good with DFSMS striping
• List prefetch
• Pages may be skip sequential or random
• Fast when used in conjunction with IBM’s List Prefetch Optimizer, especially with flash storage
• Dynamic prefetch
• Triggered by sequential detection
• Progressive prefetch quantity, good for queries when the number of sequential pages needed might be small
• Faster than sequential prefetch if no disk prefetching is needed
Access path EXPLAIN table: PREFETCH column
• PREFETCH column pertains to data processing, not index processing
• PREFETCH = ‘D’, ‘S’ or blank
• Sequential detection, dynamic prefetch
• PREFETCH = ‘L’
• List prefetch (ordered RID list scan)
• If ACCESSTYPE=‘R’ (R-scan, or table scan), PREFETCH will be ‘S’, which means sequential prefetch
7
Sequential prefetch
• Sequential prefetch is managed by buffer manager
• Getpages trigger a sequential prefetch I/O when the page number is a multiple of the prefetch quantity
• No prefetch will be scheduled until the first buffer miss
• The first buffer miss will schedule two prefetch I/Os
• The first I/O reads from the current page to the next page number that is a multiple of the current page to the next trigger page
• The second I/O reads from the next trigger page to the following trigger page
• Thereafter, a triggering Getpage will schedule one I/O that starts at the following trigger page
Dynamic prefetch
• Not managed by buffer manager, and not scheduled by Getpage
• Dynamic prefetch is a common service used by data manager and index manager, prior to a Getpage
• One I/O triggered at a time.
• The first prefetch I/O reads from the current page onward.
• Subsequent I/Os “read ahead” starting from the last page that was
prefetched, and such that two I/Os will be overlapped if the query is I/O bound, but not overlapped if the query is CPU bound.
9
Progressive dynamic prefetch quantity
• First prefetch I/O reads 8 pages (32KB)
• Second prefetch I/O reads 16 pages (64KB)
• Subsequent prefetch I/Os read 32 pages (128KB)
• Progressive prefetch quantity applies to both index and data
11
Dynamic prefetch: Sequential detection What is a page-sequential Getpage?
• The next Getpage is considered as page-sequential if it is within 16 pages (64K) of the previous Getpage
• The next Getpage is also considered as page-sequential if it is equal to then N-2 Getpage
• DB2 maintains a running page-sequential count for the last 8 Getpages
• If the next qualified row is in the same page as the previous row,
increment the page-sequential count (Row Level Sequential Detection)
• Non-leaf index pages are not counted
PNO (Page Number) +16 -16
12
Dynamic prefetch: Sequential detection
When is the first dynamic prefetch request made?
• When DB2 is about to do a Getpage for a page-sequential page that causes the page-sequential count to equal 5, and
• There is a minimum of 3 Getpages (only 2 in DB2 10)
• If this PNO is less than the previous PNO, prefetch backwards
Prefetched pages
Initial prefetch (forward) - example
• DB2 prefetches forward if the current PNO is greater than the previous PNO
• Prefetch forward pages 100 through 107
• Now the “forward prefetch window” is 104-107 and the “backward prefetch” window is 93-96
• Any Getpage within the “current window” will be classified as sequential
• Henceforth, any Getpage that falls within the forward or backward prefetch window will trigger the next prefetch request
Backward prefetch window
Forward prefetch window
Getpages
92 96 100 104 108
--- ---
1312
Current window
Second prefetch (forward)
• Prefetch quantity is 16 pages
• Prefetch 16 pages starting from page 108
• Shift these numbers to the left
• Now the prefetch windows are 108-123 and 96-100
Prefetched pages Backward
prefetch window
Forward prefetch window
Getpages
96 100 104 108 124
---
Current window
15
Fragmented segments in a segmented table scan
• May cause DB2 to toggle between sequential and dynamic prefetch, resulting in sequential synch I/Os
• To avoid this problem, do not store multiple large tables in a single table space
• With DGTT (Dynamic Global Temp Table) work files, DB2 will read whole segments using “command prefetch”
• Only the segments that are needed are prefetched
• Command prefetch is counted by buffer manager the same as dynamic prefetch
DB2 buffer pool statistics
• # of random Getpages and # of sequential Getpages
• # of random synch I/Os and # of sequential synch I/Os
• Sequential prefetch: # of requests, # of I/Os and # of pages read
• Dynamic prefetch: # of requests, # of I/Os and # of pages read
• List prefetch: # of requests, # of I/Os and # of pages read
DB2 buffer pool accounting data
• Total # of getpages
• Total # of synch I/Os
• # of sequential prefetch requests
• # of dynamic prefetch requests
• # of list prefetch requests
• # of pages prefetched “asynchronously”
Prefetch I/O quantities
• Sequential prefetch reads up to 64 pages for SQL and up to 128 pages for utilities
• The maximum quantity is used for SQL if there are at least 160MB of sequential buffers
• The maximum quantity is used for utilities if there are at least 320MB of sequential buffers
• Also, in order to get utilities to write 128 pages, you need 320MB of sequential buffers.
• Optimal utility performance requires 320MB.
• Dynamic prefetch reads up to 32 pages (128KB)
• List prefetch reads up to 32 pages (128KB), except for Fast Log Apply which reads up to 128 pages
17
Sequential buffers
• The purpose of sequential buffers is to perform prefetch I/O
• VPSEQT determines the maximum number of sequential buffers.
• The default is 80% of VPSIZE.
• The number of sequential buffers are limited to enable the rest of the buffer pool to be used as a cache for random buffers.
• How many sequential buffers do you need?
• If each sequential I/O needs 32 pages, then you need 32 times the number of concurrent sequential I/Os
• Examples:
• 100 concurrent I/Os of 32 pages each requires 3200 buffers
• 200 concurrent I/Os of 32 pages each requires 6400 buffers
• 200 concurrent I/Os of 64 pages each requires 12800 buffers
• 200 concurrent I/Os of 128 pages each requires 25600 buffers
19
Index scans
• Index scans read the pages in key sequential order
• N-1 non-leaf page contains a list of leaf pages
• Prefetch is used for leaf pages, not non-leaf pages, but non-leaf pages are often dragged into the DB2 buffer pool by dynamic prefetch which is
reading ahead.
• Sequential detection on the leaf pages
If the index is “organized” or mostly organized, dynamic prefetch is used to read the organized leaf pages. Synch I/Os for disorganized pages.
Index manager detects if too many random synch I/Os are occurring for leaf pages, in which case it will switch to list prefetch
− If 5 out of 8 Getpages result in synch I/Os, DB2 switches to list prefetch.
− Non-leaf pages will still be read synchronously and the Getpages will be classified as random.
Alternative ways to do a key range scan of the data
1. Table scans using sequential prefetch
2. Probe index for x and process the rows in key sequential order
Use sequential detection with dynamic prefetch for clustered pages, synch I/O for unclustered pages
Process the data and index simultaneously
Check to ensure the key in the row matches the key in the index No need to sort result set if the sorting key matches the index key Not a good strategy if index cluster ratio is poor
3. Sorted RID list
First process the index and build a sorted RID list that contains those RIDs qualified by the index predicates
Use list prefetch I/O to read the rows
EXAMPLE: WHERE key_column BETWEEN x AND y (…SORT BY key_column)
21
Filter predicates
1. Dynamic prefetch is best if the cluster ratio is high and the qualified rows are dense
• Otherwise, DB2 will may read rows synchronously one page at a time, or
• DB2 will prefetch skipped pages, thereby consuming more buffer pool space
2. List prefetch is best if the cluster ratio is low or the qualified rows are sparse
EXAMPLE:
WHERE (key_col1 BETWEEN x AND y) AND key_col2 = :HV
LRU processing
• DB2 has a mechanism to prevent sequentially accessed data from monopolizing the BP and pushing out useful random pages
R S
R R
R R
S S
S S
Oldest sequential buffer
Oldest buffer
• If the length of the Sequential LRU chain (SLRU) exceeds VPSEQT, buffer manager will steal the oldest sequential buffer
Getpage classification
• Prefetch always classifies buffers as sequential.
• DB2 11 applies the philosophy that READ(YES) Getpages should be classified as sequential if DB2 expects a page to have been prefetched
• Sequential synch I/Os should be rare
• A sequential synch I/O results when a sequential Getpage does not find the page in the buffer pool
• Ordinarily this should not happen if the page was prefetched
23
Causes of sequential synch I/Os
• DB2 supports 600 prefetch engines for the whole DB2 subsystem. You may run out of engines.
• Check for messages in OMPE (“PREFETCH DISABLED - NO PREFETCH ENGINES”)
• When 90% of the buffers are “in use” by dirty pages or outstanding Getpages or outstanding prefetch I/Os, prefetch scheduling is disabled
• Check for messages in OMPE (“PREFETCH DISABLED - NO BUFFERS”)
Other causes of sequential synch I/Os for which there are no obvious indicators
• If a buffer pool has an insufficient number of sequential buffers, prefetched pages may be stolen before a Getpage uses the page
• A zIIP capacity shortage may cause such severe scheduling delays that the Getpages could occur before the prefetch engine gets scheduled.
• The application is busy doing other things after initiating prefetch I/O
• Evidence might be that class 1 elapsed time far exceeds class 2 time
• Multiple fragmented tables in a segmented table space
A small percentage of sequential synch I/Os may be “false”.
•“False” means that DB2 should not have classified the page as sequential
25
False indicators of sequential synch I/Os
• “False” meaning that prefetch was not really used and the Getpage should not have been classified as sequential
• Examples:
• RID list scan of a single page. This causes a synch I/O that should be classified as random (because prefetch is not scheduled), but it is classified as sequential.
• Uncommon cases where dynamic prefetch read the wrong pages
DB2 10: Insert index I/O parallelism
• Insert index I/O parallelism uses conditional Getpages for non-leaf index pages
• A conditional Getpage resulting in a buffer miss will schedule prefetch to read the page
• These prefetch requests are classified as sequential prefetch, even though the pages are random and only a single page is prefetched
• Subsequent Getpages for the prefetched non-leaf pages are classified as random
• Insert parallelism increases the usage of prefetch engines
• Omegamon subtracts the conditional Getpages from the total Getpage count.
• INDEX_IO_PARALLELISM=NO will disable this function
27
What can you do to avoid sequential synch I/Os?
•Reduce the number of prefetch streams
•Increase the number of sequential buffers by increasing VPSEQT or VPSIZE
•Schedule the work that is using prefetch to a different time of day
•Disable Insert Index I/O Parallelism
•If the zIIP overflow to CP (APPL% IIPCP in the RMF Workload Activity Report) is high, add more zIIP capacity
• IBM recommended default values for ZIIPAWMT:
• With Hiperdispatch : 3200 microsecs (lowest allowed 1600 microsecs)
• Without Hiperdispatch : 12000 microsecs
•Consider using VPSEQT(0) or PGSTEAL(NONE) for those DB2 objects that are causing lots of prefetch requests without doing any I/O
Prefetch scheduling
• Buffer manager will never schedule sequential prefetch if all the pages are in the buffer pool. The first Getpage buffer miss will result in the first
sequential prefetch I/O.
• Dynamic and list prefetch will always schedule the prefetch engine, often resulting in requests that do no I/O
• When DB2 9 switched some access paths from sequential to dynamic
prefetch, it increased the number of prefetch requests that did no I/O, and increased the likelihood of running out of prefetch engines
• Since V9, sequential prefetch is only used now for table scans and for insert index I/O parallelism
• Scheduling prefetch requests with no I/O wastes some CPU time
• Prefetch scheduling uses latch classes 10, 14, 23, and 24
29
What does a prefetch engine do?
• Get dispatched
• Get prefetch latch
• Search the LBP (Local Buffer Pool) for all pages
• If GBP dependent, for pages not in the LBP, search for them in the GBP (Group Buffer Pool)
• If at least one page is not in the buffer pool, do prefetch I/O
• Lots of CPU time to search the buffer pools
• Since DB2 10 the prefetch engine has been zIIP-eligible
• You need plenty of zIIP capacity
• The z13 will enable you to increase your zIIP capacity without having to increase your CP capacity
What if “most” prefetch requests result in no I/Os?
• Needless prefetch requests will waste some CPU time and cause extra latch contention for latch classes 10, 14, 23, and 24
• Most of that time is zIIP-eligible or uncaptured time. Some of it is class 2 CPU time.
• If the problem gets bad enough, you may run out of prefetch engines
• A shortage of zIIP capacity may tie up these prefetch engines and increase the likelihood that your will run out of prefetch engines.
• Scheduling delays may become the cause of sequential synch I/Os.
31
Turning off prefetch
• Use VPSEQT(0)
• Useful for in-memory buffer pools
• Very slow to reload buffer pool after a restart
• PGSTEAL(NONE)
• Buffer pool reloaded using prefetch when an object is opened
• Useful when object size is not growing
• Requires careful monitoring
• Not good for general usage
• Watch this space in Cypress
Future: Cypress
33
Jeffrey Berger IBM
[email protected]
Everything You Wanted To Know About DB2 Prefetch
Please fill out your session evaluation before leaving!