• Lets look at the system I mentioned earlier
• Only two critical things to monitor
• Number of applications executing (every 10 secs)
• Length of backup time
• Other monitors run hourly
• Every thing is running smoothly
• A new job goes in to delete historic data
• Intermittent performance issues
36
37
Example 2 – Performance Issue
• What is different today to any other day?
• When are the differences?
• Do they correlate?
• What are my options to look at the information?
• Excel or SQL on a database?
• Which is easier?
• Matching tables in Excel or SQL?
• Getting those Excel functions just right or SQL functions?
• Which is repeatable?
The repeatable issue is a key one. If a problem has happened once it will probably happen again. A performance data warehouse with the right views etc will make the repeat investigations very quick. Load the data and out comes the result.
37
38
Example 2 – Performance Issue
• I believe it is SQL
• Select * from <table>
where table_name = ‘<table name>’
And <date / time criteria>
• This gets me the data I want
• I can then look at the differences
• It is also repeatable
38
39
Example 2 – Performance Issue
• In this case it was a significant increase in page reorgs
• Did you know deletes can cause inserts to page reorg?
• It causes a problem for mass deletes
• I now know what it is, but is it every table in the database?
• Create the table on the next slide
39
40
Example 2 – Performance Issue
SNAP_TIME TIMESTAMP ROWS_WRITTEN BIGINT ROWS_READ BIGINT OVERFLOW_ACCESSES BIGINT TABLE_FILE_ID INTEGER TABLE_TYPE INTEGER PAGE_REORGS BIGINT
TABLE_NAME VARCHAR128 TABLE_SCHEMA VARCHAR128
This table will contain all the information I require to diagnose the issue with the page reorgs. There is also some extra data – but who knows if I will need it?
40
41
Example 2 – Performance Issue
• First thing is to get the range of snapshot timestamps I want to use.
• This gets the first one after 0900 and the first after 15:00
with early (starttime) as (select min(snap_time) from tabsnap where time(snap_time) > '09:00'),
late (endtime) as (select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
with early (starttime) as
(select min(snap_time) from tabsnap where time(snap_time) > '09:00'), late (endtime) as
(select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select endtime from late))
select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs, l.pagereorgs -f.pagereorgs as diff
from lastsnaps l, firstsnaps f where l.tabschema = f.tabschema and l.tabname = f.tabname
;
41
42
Example 2 – Performance Issue
• Note that this is a continuation of the same statement.
• Then get the table information that I want.
• At this stage I am just looking at the overall day
• So I am putting the first snapshots into ‘firstsnaps’ and last into
‘lastsnaps’
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs from tabsnap where snap_time = (select endtime from late))
with early (starttime) as
(select min(snap_time) from tabsnap where time(snap_time) > '09:00'), late (endtime) as
(select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select endtime from late))
select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs, l.pagereorgs -f.pagereorgs as diff
from lastsnaps l, firstsnaps f where l.tabschema = f.tabschema and l.tabname = f.tabname
;
42
43
Example 2 – Performance Issue
• Then select the actual information
• I could order it by the biggest difference to get an immediate picture of the worst tables
select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs, l.pagereorgs - f.pagereorgs as diff
from lastsnaps l, firstsnaps f where l.tabschema = f.tabschema and l.tabname = f.tabname
with early (starttime) as
(select min(snap_time) from tabsnap where time(snap_time) > '09:00'), late (endtime) as
(select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select endtime from late))
select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs, l.pagereorgs -f.pagereorgs as diff
from lastsnaps l, firstsnaps f where l.tabschema = f.tabschema and l.tabname = f.tabname
;
43
44
Example 2 – Performance Issue
with early (starttime) as (select min(snap_time) from tabsnap where time(snap_time) > '09:00'),
late (endtime) as (select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs from tabsnap where snap_time = (select endtime from late)) select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs,
l.pagereorgs - f.pagereorgs as diff from lastsnaps l, firstsnaps f
where l.tabschema = f.tabschema and l.tabname = f.tabname
This is just the query in one go
with early (starttime) as
(select min(snap_time) from tabsnap where time(snap_time) > '09:00'), late (endtime) as
(select min(snap_time) from tabsnap where time(snap_time) > '15:00'),
firstsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select starttime from early)), lastsnaps (tabschema, tabname, snap_time, pagereorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time = (select endtime from late))
select l.tabschema, l.tabname, l.pagereorgs, f.pagereorgs, l.pagereorgs -f.pagereorgs as diff
from lastsnaps l, firstsnaps f where l.tabschema = f.tabschema and l.tabname = f.tabname
;
44
45
Example 2 – Performance Issue
• That’s OK as far as it goes
• What if I want more time increments?
• Or to find the worst hour?
• That becomes more interesting
45
46
Example 2 – Performance Issue
• Once again, OLAP SQL to the rescue
• The principle I always use it to build it up slowly
• The more complex the piece of SQL, the more steps I will take
• Debugging a 40 line piece of SQL is tricky
46
47
Example 2 – Performance Issue
• Step 1: Get the timestamps
with wantedtimes (timeareas) as ( values (time('09:00'))
union all
select timeareas + 1 hour from wantedtimes where timeareas < time('15:00')
),
realtimes (approx_time, time_stamps) as (select timeareas, min(snap_time)
from wantedtimes w, tabsnap t
where w.timeareas < time(t.snap_time) group by timeareas)
Wantedtimes stores every hour from 0900 to 1500.
If I added 30 minutes I would get every half an hour.
Do not go more granular than your snapshots. So if you take snapshots every 30 minutes do not do this query for every 15 minutes
Realtimes stores that actual snapshot time for the first snapshot after the time in wantedtimes
47
48
Example 2 – Performance Issue
• Step 2: Get the data
snap_data (tabschema, tabname, snap_time, page_reorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap
where snap_time in (select time_stamps from realtimes) )
This gets all the snapshot data for the timestamps in realtimes
48
49
Example 2 – Performance Issue
• Step 2: Now get the data with increments
inc_data (tabschema, tabname, snap_time, page_reorgs, diff, totdiff) as
(select tabschema, tabname, snap_time, page_reorgs, page_reorgs - sum(page_reorgs) over
(partition by tabschema, tabname order by tabschema, tabname, snap_time rows between 1 preceding and 1 preceding),
page_reorgs - min(page_reorgs) over (partition by tabschema, tabname) from snap_data )
This now keeps the same data, but adds in the increment of page reorgs.
The partition by and order by make sure it is the same table
The rows between 1 preceding and 1 preceding get the data from the previous row – be careful of going unbounded as it can take a long time especially if it has to get every rows back first
49
50
Example 2 – Performance Issue
• That’s the base data
• I can make that a table or a view or just put it into statements
• Everything else comes from that
(tabschema, tabname, snap_time, pagereorgs, diff, tot_diff) as with wantedtimes (timeareas) as
( values (time('09:00')) union all
select timeareas + 1 hour from wantedtimes where timeareas < time('15:00')
),
realtimes (approx_time, time_stamps) as (select timeareas, min(snap_time)
from wantedtimes w, tabsnap t
where w.timeareas < time(t.snap_time) group by timeareas),
snap_data (tabschema, tabname, snap_time, page_reorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap
where snap_time in (select time_stamps from realtimes) ) select tabschema, tabname, snap_time, page_reorgs,
page_reorgs - sum(page_reorgs) over
(partition by tabschema, tabname order by tabschema, tabname, snap_time rows between 1 preceding and 1 preceding),
page_reorgs - min(page_reorgs) over (partition by tabschema, tabname) from snap_data
50
51
Example 2 – Performance Issue
• What’s the worst hour?
select snap_time, sum(diff) from inc_data
group by snap_time
51
52
Example 2 – Performance Issue
• What’s the worst table?
select tabschema, tabname, totdiff from inc_data
order by totdiff desc fetch first 1 row only
52
53
Example 2 – Performance Issue
• How bad were the tables in the worst hour?
worsthours (worst_snap_time, tdiff) as
(select snap_time, coalesce(sum(diff),0) as totdiff from inc_data
group by snap_time order by totdiff desc fetch first 1 row only )
select tabschema, tabname, worst_snap_time, page_reorgs, diff from inc_data, worsthours
where snap_time = worst_snap_time
This is a combination of the previous two examples.
Worsthours gets me the worsthour and then I get all the information for that hour
53
54
Example 2 – Performance Issue
• The worst 10 tables in the worst hour?
worsthours (worst_snap_time, tdiff) as
(select snap_time, coalesce(sum(diff),0) as totdiff from inc_data group by snap_time
order by totdiff desc fetch first 1 row only )
select tabschema, tabname, worst_snap_time, page_reorgs, diff from inc_data, worsthours
where snap_time = worst_snap_time and diff is not null order by diff desc
fetch first 10 rows only
with wantedtimes (timeareas) as ( values (time('09:00'))
union all
select timeareas + 1 hour from wantedtimes where timeareas < time('15:00')
),
realtimes (approx_time, time_stamps) as (select timeareas, min(snap_time)
from wantedtimes w, tabsnap t
where w.timeareas < time(t.snap_time) group by timeareas),
snap_data (tabschema, tabname, snap_time, page_reorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap
where snap_time in (select time_stamps from realtimes) ),
inc_data (tabschema, tabname, snap_time, page_reorgs, diff, totdiff) as (select tabschema, tabname, snap_time, page_reorgs,
page_reorgs - sum(page_reorgs) over
(partition by tabschema, tabname order by tabschema, tabname, snap_time rows between 1 preceding and 1 preceding),
i ( )
54
55
Example 2 – Performance Issue
• Was it important to look at the increments?
• Yes
• There was a batch job that did a lot of work on 1 table
• This had 40K + page reorgs
• 95%+ were before 7am – they weren’t the problem
• Top six tables were quickly highlighted as the key tables
with wantedtimes (timeareas) as ( values (time('09:00'))
union all
select timeareas + intervals from wantedtimes, intervals where timeareas < time('15:00') ), realtimes (approx_time, time_stamps) as (select timeareas, min(snap_time)
from wantedtimes w, tabsnap t
where w.timeareas < time(t.snap_time) group by timeareas),
snap_data (tabschema, tabname, snap_time, page_reorgs) as (select table_schema, table_name, snap_time, page_reorgs
from tabsnap where snap_time in (select time_stamps from realtimes) ), ltddata (tabschema, tabname, snap_time, page_reorgs, diff, fulldiff) as (select tabschema, tabname, snap_time, page_reorgs,
page_reorgs - sum(page_reorgs) over
(partition by tabschema, tabname order by tabschema, tabname, snap_time rows between 1 preceding and 1 preceding),
page_reorgs - min(page_reorgs) over (partition by tabschema, tabname) from snap_data),
worsthours (worst_snap_time, tdiff) as
(select snap_time, coalesce(sum(diff),0) as totdiff from ltddata group by snap_time
order by totdiff desc fetch first 1 row only )
select tabschema, tabname, worst_snap_time, page_reorgs, diff, fulldiff 55
56
AGENDA
• What is a performance warehouse?
• How do I justify a performance warehouse?
• What are the priorities?
• Examples of information
• What then?
56
57