Bangalore, India August 6 - 9, 2007
IBM
WEBSPHERE
TECHNICAL CONFERENCE
Session Number: W02
Tuning the Java Virtual Machine for Optimal Performance:
Means and Methods
Rajeev Palanki
IBM Java Technology Center
[email protected]
2007 WebSphere Technical Conference (Bangalore, India)
© 2007 IBM Corporation 2
Objectives
• Gain insight into key aspects of JVM runtime performance and understand the means and methods of tuning the JVM for optimal performance.
• At the end of this session you should have:
  – A high-level overview of the Java Virtual Machine (JVM) and its key components.
  – An understanding of the different garbage collection schemes in the Sovereign and J9 virtual machines and their impact on JVM runtime performance.
  – Knowledge of how to use verbosegc output effectively to improve application response times.
  – An introduction to Shared Classes technology and its performance gains.
Agenda
• Overview of the Java Virtual Machine and its key components
• Garbage Collection Basics (Sovereign VM – 142 JDK)
• Profiling Garbage Collection: Verbosegc Outputs
• Garbage Collection Policies in Java 5.0
• Debugging and Analysis tools for Garbage Collection
• Introduction to Shared Classes Technology
• Real Time Java – A brief overview
IBM Java Building Blocks
Java SDK
Virtual Machine
Class Libraries
JIT
ORB
XML
Security
Big decimal
RAS
The Java Application Stack
[Diagram: components of the Java application stack]
Java Code
Platform
Native libraries
Java Class Libraries
ORB
Java Class Extensions
Native Code
Execution Management (XM)
Core Interface (CI)
Diagnostics
Execution Engine
Classloader
Lock
Data Conversion
Storage
Building Blocks
The JDK is a key component in the Application Server stack from a performance perspective.
Operating system
– Vendor-specific operating environment
– Specific hardware architecture (instruction set)
Java SDK (Virtual Machine, Class Libraries, JIT)
– Virtual Machine: platform-independent execution environment that abstracts operating system specifics from the developer/user.
– Class Libraries: collection of well-defined code packages that assist developers’ creation of business applications (3 specifications).
– Just-in-Time Compiler: code generator that converts bytecodes into machine language instructions at run time.
Application
– “Write Once Run Anywhere”
IBM Java 5.0
• IBM has completely redesigned and rewritten its VM for Java 5.0. You may have heard it referred to as J9.
  – New Virtual Machine implementation (J9)
  – New Garbage Collection mechanism (Modron)
  – New Just-In-Time compiler (Testarossa)
  – Shared Classes technology
• Just-In-Time compiler (Testarossa)
  – Multiple optimization levels
  – Recompilation driven by a sampling thread
  – Dynamic profile information collection
  – Profiling thread helps determine the “hotness” of methods
  – Asynchronous compilation
• Garbage Collection (Modron)
  – Uses a “type accurate” collector
  – Introduces a parallel compactor
Garbage Collection Overview
• Garbage Collection (GC)
  – The main cause of memory-related performance bottlenecks in Java.
• Two things to look at in GC: frequency and duration
  – Frequency depends on the heap size and the allocation rate.
  – Duration depends on the heap size and the number of objects in the heap.
• GC algorithm
  – It is critical to understand how it works so that tuning is done more intelligently.
• How do you eliminate GC bottlenecks?
  – Minimize the use of objects by following good programming practices.
The (IBM) JVM Garbage Collector
• “The purpose of Garbage Collection is to identify Java Heap storage which is no longer being used by the Java application and so is available for reuse by the JVM”
• Key questions:
  – Performance and scalability: How quickly can you find garbage?
  – Accuracy: Can you find all the garbage?
Garbage Collection: IBM Technology
• Concurrent mark
  – Most of the marking phase is done outside of ‘Stop the World’, while the ‘mutator’ threads are still active, giving a significant improvement in pause time.
• Parallelizing the garbage collection phases
  – The mark and sweep workload is distributed across the available processors, resulting in a significant improvement in pause times.
• Adaptive sizing of thread local heaps
  – Reduces the amount of Java heap locking.
• Incremental compaction
  – The expense of compaction is distributed across GCs, reducing the occasional long pause.
• Java 5 technologies
The JVM Heaps
[Diagram: the Java heap and the native heap]
Java Heap: chunks of free storage linked on a free list; each chunk carries a Size and a Next field, and the last Next is Null.
Native Heap: thread stacks, buffers, JIT compiled code, Motif structures.
Allocation schemes
• Two types of allocation
  – Cache Allocation (for object allocations < 512 bytes) does not require the heap lock. Each thread has local storage in the heap (TLH, Thread Local Heap) where the objects are allocated.
  – Heap Lock Allocation occurs when the allocation request is more than 512 bytes and requires the heap lock.
if size < 512 or there is enough space in the cache
    try cacheAlloc
    return if OK
HEAP_LOCK
do forever
    if there is a big enough chunk on the free list
        take it
        goto Gotit
    else
        manageAllocFailure
        if any error
            goto GetOut
end do
Gotit:
    initialize object
GetOut:
    HEAP_UNLOCK
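The two allocation paths can be sketched in Java as a toy model. This is only an illustration under stated assumptions: the class name AllocSketch, the 4096-byte cache refill size (TLH_SIZE) and the use of plain offsets as addresses are hypothetical, and returning -1 stands in for manageAllocFailure, which in a real JVM would trigger a garbage collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of cache allocation versus heap lock allocation (not the JVM's code). */
final class AllocSketch {
    static final int CACHE_LIMIT = 512;   // threshold quoted on the slide
    static final int TLH_SIZE = 4096;     // hypothetical cache refill size

    private final List<long[]> freeList = new ArrayList<>(); // each entry: {start, size}
    private final ReentrantLock heapLock = new ReentrantLock();
    private final ThreadLocal<long[]> tlh =
            ThreadLocal.withInitial(() -> new long[] {0, 0}); // {next, end}

    AllocSketch(long heapSize) {
        freeList.add(new long[] {0, heapSize});
    }

    /** Returns the offset of the new object, or -1 when no chunk is big enough. */
    long allocate(int size) {
        long[] cache = tlh.get();
        if (cache[1] - cache[0] >= size) {        // cache allocation: no heap lock
            long addr = cache[0];
            cache[0] += size;
            return addr;
        }
        heapLock.lock();                          // HEAP_LOCK
        try {
            if (size < CACHE_LIMIT) {             // small object: refill the thread's cache
                long start = takeFirstFit(TLH_SIZE);
                if (start >= 0) {
                    cache[0] = start + size;
                    cache[1] = start + TLH_SIZE;
                    return start;
                }
            }
            return takeFirstFit(size);            // -1 stands in for manageAllocFailure
        } finally {
            heapLock.unlock();                    // HEAP_UNLOCK
        }
    }

    /** First-fit search of the single free list; caller must hold the heap lock. */
    private long takeFirstFit(long size) {
        for (int i = 0; i < freeList.size(); i++) {
            long[] chunk = freeList.get(i);
            if (chunk[1] >= size) {
                long start = chunk[0];
                chunk[0] += size;
                chunk[1] -= size;
                if (chunk[1] == 0) {
                    freeList.remove(i);
                }
                return start;
            }
        }
        return -1;
    }
}
```

The point of the sketch is that the common case (a small object with room in the thread's cache) never touches the lock, which is why TLH sizing matters for multiprocessor scalability.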
Large Object Allocation
• All objects >= 64K are termed “large” from the VM perspective.
• In practice, objects of 10MB+ in size are usually considered large.
• The Large Object Area (LOA) is 5% of the active heap by default.
• The VM first tries to allocate any object from the free list of the main heap. If there is not enough contiguous space in the main heap to satisfy an allocation request for an object >= 64K, the object is allocated in the Large Object Area (wilderness).
• Objects < 64K can only be allocated in the main heap and never in the Large Object Area.
[Diagram: active heap, LOA, Xmx]
Users can identify the Java stack of a thread that makes an allocation request larger than the value specified with the environment variable ALLOCATION_THRESHOLD:
export ALLOCATION_THRESHOLD=5400
This gives Java stacks for object allocations greater than 5400 bytes.
Users can specify the desired size of the Large Object Area with the -Xloratio<n> option, where n determines the fraction of the heap designated for the LOA.
Sub pools
• Subpools provide an improved object allocation policy, available from JDK 1.4.1 releases and only on AIX.
  – Improved time for allocating objects
  – Avoids premature GCs due to allocation of large objects
  – Improves MP scalability by reducing time under HEAP_LOCK
  – Optimizes TLH sizes and storage utilization
• The subpool algorithm uses multiple free lists rather than the single free list used by the default allocation scheme.
• It tries to predict the size of future allocation requests based on earlier allocation requests, and recreates the free lists at the end of each GC based on these predictions.
• While allocating objects on the heap, free chunks are chosen using a “best fit” method, as opposed to the “first fit” method used in other algorithms.
• It is enabled by the -Xgcpolicy:subpool option.
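The difference between “first fit” and “best fit” can be shown with a few lines of Java. This sketch covers only the chunk-selection rule; it does not model the multiple free lists or the size prediction of the subpool policy, and the class and method names are hypothetical.

```java
import java.util.List;

/** Chunk selection only: which free chunk would each strategy pick? */
final class FitSketch {
    /** Index of the first chunk that is large enough, or -1 if none is. */
    static int firstFit(List<Integer> chunkSizes, int request) {
        for (int i = 0; i < chunkSizes.size(); i++) {
            if (chunkSizes.get(i) >= request) {
                return i;
            }
        }
        return -1;
    }

    /** Index of the smallest chunk that is large enough, or -1 if none is. */
    static int bestFit(List<Integer> chunkSizes, int request) {
        int best = -1;
        for (int i = 0; i < chunkSizes.size(); i++) {
            int size = chunkSizes.get(i);
            if (size >= request && (best < 0 || size < chunkSizes.get(best))) {
                best = i;
            }
        }
        return best;
    }
}
```

With free chunks of 900, 300, 200 and 500 bytes and a 180-byte request, first fit carves up the 900-byte chunk, while best fit takes the 200-byte chunk and leaves the large chunk intact for a later large allocation.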
Garbage Collection Basics (142 JVM)
• Garbage Collection is performed when there is:
  – An allocation failure in the heap lock allocation
  – A specific call to System.gc()
• Garbage Collection is Stop the World (all other application threads are suspended during GC).
• Two main technologies are used to remove garbage:
  – Mark Sweep Collector
  – Copy Collector
• GC occurs in the thread that handled the request:
  – The requested object allocation that caused the allocation failure
  – The programmatically requested GC
• The thread must acquire certain locks required for GC:
  – Heap lock
  – Thread queue lock
Object reclamation process for a Mark Sweep Collector
• Obtain locks and suspend threads
• Mark phase
  – The process of identifying all objects reachable from the root set.
  – All “live” objects are marked by setting a mark bit in the mark bit vector.
• Sweep phase
  – Identifies all the objects that have been allocated but are no longer referenced.
• Compaction (optional)
  – Once garbage has been removed, we consider compacting the resulting set of objects to remove spaces between them.
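The mark and sweep phases above can be sketched on a toy object graph. This is a minimal illustration, not the collector's implementation: objects are represented by integer ids, the graph is an adjacency list, and the class name MarkSweepSketch is hypothetical. The explicit mark stack is the structure whose overflow is discussed later in the session.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;

/** Toy mark-sweep over objects identified by index 0..n-1. */
final class MarkSweepSketch {
    /** Mark phase: references.get(i) lists the ids that object i points to. */
    static BitSet mark(List<int[]> references, int[] roots) {
        BitSet markBits = new BitSet(references.size()); // the "mark bit vector"
        Deque<Integer> markStack = new ArrayDeque<>();
        for (int root : roots) {
            if (!markBits.get(root)) {
                markBits.set(root);
                markStack.push(root);
            }
        }
        while (!markStack.isEmpty()) {
            int obj = markStack.pop();
            for (int child : references.get(obj)) {
                if (!markBits.get(child)) {
                    markBits.set(child);
                    markStack.push(child);
                }
            }
        }
        return markBits;
    }

    /** Sweep phase: every allocated object without a mark bit is garbage. */
    static List<Integer> sweep(int objectCount, BitSet markBits) {
        List<Integer> garbage = new ArrayList<>();
        for (int i = 0; i < objectCount; i++) {
            if (!markBits.get(i)) {
                garbage.add(i);
            }
        }
        return garbage;
    }
}
```

Note that two objects referring only to each other (a cycle) are still swept, because neither is reachable from the root set.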
Mark – Sweep – Compact Algorithm
[Diagram: the root set pointing into the heap, with reachable and unreachable objects]
Together we achieve: Some parallel processing
• GC helper threads
  – On a multiprocessor system with N CPUs, a JVM supporting parallel mode starts N-1 garbage collection helper threads at the time of initialization.
  – These threads remain idle at all times when the application code is running; they are called into play only when garbage collection is active.
  – For a particular phase, work is divided between the thread driving the garbage collection and the helper threads, making a total of N threads running in parallel on an N-CPU machine.
  – The only way to disable the parallel mode is to use the -Xgcthreads parameter to change the number of garbage collection helper threads being started.
• Parallel mark
  – The basic idea is to augment object marking through the addition of helper threads and a facility for sharing work between them.
• Parallel bitwise sweep
  – Similar to parallel mark; uses the same helper threads as parallel mark.
Concurrent Marking
• Designed to give reduced and consistent GC pause times as heap sizes increase.
• Concurrent marking aims to complete the marking just before the heap is full.
• In the concurrent phase, the Garbage Collector scans the roots by asking each thread to scan its own stack. These roots are then used to trace live objects concurrently.
• Tracing is done by a low-priority background thread and by each application thread when it does a heap lock allocation.
Incremental Compaction
• Incremental compaction removes the dark matter from the heap and reduces pause times significantly.
• The fundamental idea behind incremental compaction is to split the heap up into sections and compact each section just as during a full compaction.
• Incremental compaction was introduced in JDK 1.4.0; it is enabled by default and triggered under particular conditions (called “reasons”).
• -Xpartialcompactgc
  – Option to invoke incremental compaction in every GC cycle.
• -Xnopartialcompactgc
  – Option to disable incremental compaction.
• -Xnocompactgc
  – Option to disable full compaction.
The heap is divided into regions
The regions are further divided into sections
Each section is handled by one helper thread
A region is divided into (number of helper threads + 1) or 8 sections (whichever is less)
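The two sizing rules quoted in this part of the session (N-1 helper threads on an N-CPU machine, and the number of sections per region) are simple arithmetic. A minimal sketch follows; the class and method names are hypothetical, and the real JVM may apply further limits that the slides do not mention.

```java
/** Arithmetic from the slides: default helper threads and sections per region. */
final class GcThreadMath {
    /** A JVM in parallel mode starts N-1 helper threads on an N-CPU machine. */
    static int defaultHelperThreads(int cpus) {
        return Math.max(cpus - 1, 0);
    }

    /** A region is divided into (helper threads + 1) or 8 sections, whichever is less. */
    static int sectionsPerRegion(int helperThreads) {
        return Math.min(helperThreads + 1, 8);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int helpers = defaultHelperThreads(cpus);
        System.out.println("CPUs: " + cpus
                + ", helper threads: " + helpers
                + ", sections per region: " + sectionsPerRegion(helpers));
    }
}
```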
Explicit Garbage Collection
• The garbage collector is called only under two conditions:
  – When an allocation failure occurs
  – When GC is explicitly requested using System.gc()
• Don’t call System.gc() at all. It hurts more often than it helps. GC knows when it should run.
• The temptation to scatter System.gc() calls here, there, and everywhere is enormous. It is not a good idea.
TRUST ME !!!!!
Profiling Garbage Collection: Verbosegc output
• The most indispensable tool for profiling GC activity is verbosegc, provided by the JVM runtime.
• Enabled using -verbosegc on the java command line.
• Verbosegc redirection
  -Xverbosegclog:<path to file>filename
• Verbosegc redirection to multiple files
  -Xverbosegclog:<path to file>filename#,X,Y
Understanding a typical verbosegc output
<AF[71]: Allocation Failure. need 65552 bytes, 3 ms since last AF>
<AF[71]: managing allocation failure, action=2 (142696/10484224)>
<GC(71): GC cycle started Fri Mar 19 17:59:06 2004
<GC(71): freed 94184 bytes, 2% free (236880/10484224), in 12 ms>
<GC(71): mark: 5 ms, sweep: 0 ms, compact: 7 ms>
<GC(71): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<GC(71): moved 3095 objects, 188552 bytes, reason=1>
<AF[71]: managing allocation failure, action=3 (236880/10484224)>
<AF[71]: managing allocation failure, action=4 (236880/10484224)>
<AF[71]: managing allocation failure, action=6 (236880/10484224)>
JVMDG217: Dump Handler is Processing a Signal - Please Wait.
JVMDG315: JVM Requesting Heap dump file
JVMDG318: Heap dump file written to /workarea/rajeev/gctests/heapdump.20040319.175906.8467.txt
JVMDG303: JVM Requesting Java core file
JVMDG304: Java core file written to /workarea/rajeev/gctests/javacore.20040319.175906.8467.txt
JVMDG274: Dump Handler has Processed OutOfMemory.
<AF[71]: insufficient heap space to satisfy allocation request>
<AF[71]: completed in 203 ms>
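For quick ad hoc analysis, the “freed” line of each cycle can be pulled apart with a regular expression. The sketch below assumes the 1.4.2-style line format exactly as it appears in the sample above; the class name VerboseGcLine and its field names are hypothetical, and other JVM levels print different formats.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses one 1.4.2-style "freed" line from verbosegc output. */
final class VerboseGcLine {
    private static final Pattern FREED = Pattern.compile(
            "<GC\\((\\d+)\\): freed (\\d+) bytes, (\\d+)% free \\((\\d+)/(\\d+)\\), in (\\d+) ms>");

    final int cycle;         // GC cycle number
    final long freedBytes;   // bytes reclaimed by this cycle
    final int percentFree;   // free heap after the cycle, as printed
    final long freeBytes;    // free heap after the cycle, in bytes
    final long totalBytes;   // total heap size, in bytes
    final long pauseMs;      // time spent in this GC cycle

    private VerboseGcLine(Matcher m) {
        cycle = Integer.parseInt(m.group(1));
        freedBytes = Long.parseLong(m.group(2));
        percentFree = Integer.parseInt(m.group(3));
        freeBytes = Long.parseLong(m.group(4));
        totalBytes = Long.parseLong(m.group(5));
        pauseMs = Long.parseLong(m.group(6));
    }

    /** Returns the parsed line, or null if the line is not a "freed" line. */
    static VerboseGcLine parse(String line) {
        Matcher m = FREED.matcher(line.trim());
        return m.matches() ? new VerboseGcLine(m) : null;
    }
}
```

Feeding every line of a log through parse and keeping the non-null results gives the per-cycle pause time and free heap, which are the two figures most of the following tuning advice relies on.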
When are GC messages printed out?
• The first two lines are printed just before the beginning of the Stop the World (STW) phase of GC.
• The rest of the messages are printed after the STW phase ends and threads are woken up. No messages are printed during GC.
• Heap shrinkage messages are printed before the STW messages, but shrinkage happens AFTER the STW phase!
• Heap expansion messages are correctly printed AFTER the STW messages.
Things to look for in a verbosegc output
• Was it an Allocation Failure GC?
• What was the size of the allocation request that caused the AF?
• What were the total and free heap sizes before GC?
• What was the total pause time?
• Where was the maximum time spent in GC?
• Are we doing a compaction in each GC cycle?
• What actions were taken by GC?
• Was GC able to meet the allocation request in the end?
GC actions (JDK 142)
• Look for lines of this type:
  managing allocation failure, action=<n>
  where <n> is the numerical value of the action taken.
• Actions:
  0 -> GC because of exhaustion of the pinned free list.
  1 -> Perform garbage collection without using the wilderness.
  2 -> The Garbage Collector tried to allocate out of the wilderness and failed.
  3 -> Expand the Java heap.
  4 -> Clear soft references.
Using verbosegc to set the heap size
• Use verbosegc to estimate the ideal size of the heap, and then tune using -Xmx and -Xms.
• Setting -Xms:
  – Should be big enough to avoid AFs from the time the application starts to the time it becomes ‘ready’. (It should not be any bigger!)
• Setting -Xmx:
  – Under normal load, free heap space after each GC should be > minf (default is 30%).
  – There should not be any OutOfMemory errors.
  – Under the heaviest load, if free heap space after each GC is > maxf (default is 70%), the heap size is too big.
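The minf and maxf rules of thumb can be written as a small check on the free/total figures that verbosegc prints after each GC. This is a sketch under assumptions: the class name HeapSizeCheck is hypothetical, the thresholds are the default values quoted above, and the check collapses the “normal load” and “heaviest load” conditions into a single verdict, so it should be applied to samples taken under the relevant load.

```java
/** Classifies the free heap after a GC against the default minf/maxf thresholds. */
final class HeapSizeCheck {
    static final int MINF_PERCENT = 30;  // default minf quoted on the slide
    static final int MAXF_PERCENT = 70;  // default maxf quoted on the slide

    enum Verdict { TOO_SMALL, OK, TOO_BIG }

    static Verdict judge(long freeBytesAfterGc, long totalHeapBytes) {
        double percentFree = 100.0 * freeBytesAfterGc / totalHeapBytes;
        if (percentFree < MINF_PERCENT) {
            return Verdict.TOO_SMALL;   // less than minf free even after a GC
        }
        if (percentFree > MAXF_PERCENT) {
            return Verdict.TOO_BIG;     // more than maxf free: heap is oversized
        }
        return Verdict.OK;
    }
}
```

For instance, a cycle that ends with 323424 of 10484224 bytes free (about 3%) is classified as TOO_SMALL.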
Example of verbosegc when heap is too small
GC is too frequent
<AF[25]: Allocation Failure. need 65552 bytes, 1 ms since last AF>
<AF[25]: managing allocation failure, action=2 (319456/10484224)>
<GC(25): GC cycle started Sat Mar 20 15:32:50 2004
<GC(25): freed 3968 bytes, 3% free (323424/10484224), in 11 ms>
<GC(25): mark: 5 ms, sweep: 0 ms, compact: 6 ms>
<GC(25): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<GC(25): moved 214 objects, 9352 bytes, reason=1>
<AF[25]: managing allocation failure, action=3 (323424/10484224)>
<AF[25]: managing allocation failure, action=4 (323424/10484224)>
<AF[25]: managing allocation failure, action=6 (323424/10484224)>
<AF[25]: warning! free memory getting short(1). (323424/10484224)>
<AF[25]: completed in 13 ms>
Example of verbosegc when heap is too big
GC is too long
<AF[29]: Allocation Failure. need 2321688 bytes, 88925 ms since last AF>
<AF[29]: managing allocation failure, action=1 (3235443800/20968372736) (3145728/3145728)>
<GC(29): GC cycle started Mon Nov 4 14:46:20 2002
<GC(29): freed 8838057824 bytes, 57% free (12076647352/20971518464), in 4749 ms>
<GC(29): mark: 4240 ms, sweep: 509 ms, compact: 0 ms>
<GC(29): refs: soft 0 (age >= 32), weak 0, final 1, phantom 0>
<AF[29]: completed in 4763 ms>
Effect of wrong -Xms & -Xmx settings
• Too small a heap = too frequent GC.
• Too big a heap = too much GC pause time (irrespective of the amount of physical memory on the system).
• Heap size > physical memory size = paging/swapping = bad for your application.
• It is desirable to set -Xms much lower than -Xmx if you are encountering fragmentation issues. This forces class allocations, threads and persistent objects to be allocated at the bottom of the heap.
• What about -Xms = -Xmx?
  – It means no heap expansion or shrinkage will ever occur.
  – Not normally recommended.
  – It may be good for a few apps which require constant high heap storage space.
  – It hurts apps which show a varying load.
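From inside an application, the standard java.lang.Runtime methods give a rough view of where the heap currently sits between its initial and maximum sizes. The sketch below uses only those standard calls; the class name HeapReport is hypothetical, and the exact values reported depend on the JVM.

```java
/** Prints the current heap figures using standard java.lang.Runtime calls. */
final class HeapReport {
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound on the heap, governed by -Xmx
        long total = rt.totalMemory();  // heap currently committed (starts near -Xms)
        long free = rt.freeMemory();    // free space within the committed heap
        return "max=" + max + " total=" + total + " free=" + free
                + " canExpand=" + (total < max);
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

When the JVM is started with -Xms equal to -Xmx, total and max are typically reported as equal from the start, which is the “no expansion or shrinkage” case described above.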
Mark Stack Overflow (MSO)
• Verbosegc will contain the line:
  <GC(45): mark stack overflow>
• Is bad for performance.
• Caused by too many objects on the heap, especially deeply nested objects.
• Processing MSOs is expensive.
• No solution is a silver bullet. Things to try:
  – Decrease the heap size!
  – Use concurrent mark (-Xgcpolicy:optavgpause)
  – Re-design the application.
High pause times and system activity
• If pause times are usually acceptable, with the exception of a few "abnormally high" spikes, the deviation is likely the result of some system-level activity (heavy paging, for example) outside of the Java process.
• Consideration: How many clock ticks did our process actually spend executing instructions, as opposed to time spent waiting for I/O or waiting for a CPU to become available for the process to run on?
Headline Changes in Java 5.0 Garbage Collector
Java Tiger: JSE 5.0
• 5.0 uses a completely new memory management framework
• No pinned/dosed objects
  – Stack maps are used to provide a level of indirection between references and the heap
  – The 5.0 VM never pins arrays; it always makes a copy for the JNI code
• The GC is type accurate
• New, efficient parallel compactor
Garbage collection policies in IBM Java 5.0
• Four policies available:
  optthruput (default)
  – Mark-sweep algorithm
  – Fastest for many workloads
  optavgpause
  – Concurrent collection and concurrent sweep
  – Small mean pause
  – Throughput impact
  gencon
  – The Generational Hypothesis
  – Fastest for transactional workloads
  – Combines low pause times and high throughput
  subpool
  – Mark-sweep based, but with multiple free lists
  – Avoids allocation contention on many-processor machines
Why different GC Policies?
• The availability of different GC policies gives you increased capabilities.
• The best choice depends upon application behaviour and workloads.
• Think about throughput, response times and pause times.
  – Throughput is the amount of data processed by the application.
  – Pause time is the amount of time the garbage collector has stopped threads while collecting the heap.
  – Response time is the latency of the application: how quickly it answers incoming requests.
Policy | Option (-Xgcpolicy) | Description
Optimize for throughput | optthruput (default) | Typically used for applications where raw throughput is more important than short GC pauses. The application is stopped each time that garbage is collected.
Optimize for pause times | optavgpause | Trades high throughput for shorter GC pauses by performing some of the garbage collection concurrently. The application is paused for shorter times.
Generational Concurrent | gencon | Handles short-lived objects differently from longer-lived ones.
Subpool | subpool | Uses an algorithm similar to the default policy but employs an allocation strategy suitable for SMPs.
Runtime Performance Tuning
What policy should I choose for my J9 VM?
Policy: Considerations
optthruput
  • I want my application to run to completion as quickly as possible.
optavgpause
  • My application requires good response times to unpredictable events.
  • A degradation in performance is acceptable as long as GC pause times are reduced.
  • My application is running very large Java heaps.
  • My application is a GUI application and user response times are important.
gencon
  • My application allocates many short-lived objects.
  • The Java heap space is fragmented.
Memory Management / Garbage Collection
How the IBM J9 Generational Garbage Collector Works
[Diagram: JVM heap divided into Nursery/Young Generation, Old Generation and Permanent Space, with the options that size each area]
Nursery/Young Generation
  IBM J9: -Xmn (-Xmns/-Xmnx)
  Sun: -XX:NewSize=nn, -XX:MaxNewSize=nn, -Xmn<size>
Old Generation
  IBM J9: -Xmo (-Xmos/-Xmox)
  Sun: -XX:NewRatio=n
Permanent Space (Sun JVM only)
  -XX:MaxPermSize=nn
• Minor collection: takes place only in the young generation and is normally done through direct copying, which is very efficient.
• Major collection: takes place in the new and old generations and uses the normal mark/sweep (+compact) algorithm.
Nursery/Young Generation
[Diagram: the nursery shown twice, as Allocate Space / Survivor Space and then as Survivor Space / Allocate Space]
• The nursery is split into two spaces (semi-spaces)
  – Only one contains live objects and is available for allocation
  – Minor collections (scavenges) move objects between spaces
  – The roles of the spaces are then reversed
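The semi-space idea can be sketched with two lists that swap roles after each scavenge. This is a toy model only: the class name NurserySketch is hypothetical, liveness is supplied by the caller as a predicate instead of being traced from roots, and tenuring of long-lived objects into the old generation is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy semi-space nursery: live objects are copied, then the spaces swap roles. */
final class NurserySketch {
    private List<String> allocateSpace = new ArrayList<>();
    private List<String> survivorSpace = new ArrayList<>();

    void allocate(String obj) {
        allocateSpace.add(obj);
    }

    /** Copies live objects to the survivor space and flips; returns the number reclaimed. */
    int scavenge(Predicate<String> isLive) {
        for (String obj : allocateSpace) {
            if (isLive.test(obj)) {
                survivorSpace.add(obj);      // copying compacts the survivors for free
            }
        }
        int reclaimed = allocateSpace.size() - survivorSpace.size();
        allocateSpace.clear();               // everything left behind is garbage
        List<String> previous = allocateSpace;
        allocateSpace = survivorSpace;       // the roles of the spaces are reversed
        survivorSpace = previous;
        return reclaimed;
    }

    List<String> contents() {
        return new ArrayList<>(allocateSpace);
    }
}
```

The cost of a scavenge in this model is proportional to the number of live objects copied, not to the amount of garbage, which is why the approach suits workloads where most objects die young.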
Sample verbosegc output for gencon
<af type="nursery" id="35" timestamp="Thu Aug 11 21:47:11 2005" intervalms="10730.361"> <minimum requested_bytes="144" />
<time exclusiveaccessms="1.193" />
<nursery freebytes="0" totalbytes="1226833920" percent="0" />
<tenured freebytes="68687704" totalbytes="209715200" percent="32" > <soa freebytes="58201944" totalbytes="199229440" percent="29" /> <loa freebytes="10485760" totalbytes="10485760" percent="100" /> </tenured>
<gc type="scavenger" id="35" totalid="35" intervalms="10731.054"> <flipped objectcount="1059594" bytes="56898904" />
<tenured objectcount="12580" bytes="677620" /> <refs_cleared soft="0" weak="691" phantom="39" /> <finalization objectsqueued="1216" />
<scavenger tiltratio="90" />
<nursery freebytes="1167543760" totalbytes="1226833920" percent="95" tenureage="14" /> <tenured freebytes="67508056" totalbytes="209715200" percent="32" >
<soa freebytes="57022296" totalbytes="199229440" percent="28" /> <loa freebytes="10485760" totalbytes="10485760" percent="100" /> </tenured>
<time totalms="368.309" /> </gc>
<nursery freebytes="1167541712" totalbytes="1226833920" percent="95" /> <tenured freebytes="67508056" totalbytes="209715200" percent="32" >
<soa freebytes="57022296" totalbytes="199229440" percent="28" /> <loa freebytes="10485760" totalbytes="10485760" percent="100" /> </tenured>
<time totalms="377.634" /> </af>
Callouts:
– Allocation request details and the time it took to stop all mutator threads.
– Heap occupancy details before GC.
– Heap occupancy details after GC.
– Details about the scavenge.
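Because the Java 5.0 verbosegc output is XML, a single <af> stanza can be read with the standard DOM parser. The sketch below is an assumption-laden illustration: the class name GenconAfSketch is hypothetical, it expects one complete <af> element as input, and it relies only on the element and attribute names visible in the sample above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Reads nursery occupancy before/after and the total pause from one <af> stanza. */
final class GenconAfSketch {
    /** Returns {nursery free bytes before GC, nursery free bytes after GC, total pause in ms}. */
    static double[] summarize(String afXml) {
        Element af;
        try {
            af = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(afXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed <af> stanza", e);
        }
        Element before = null;
        Element after = null;
        Element total = null;
        // Only direct children of <af> are inspected; the <nursery> inside <gc> is skipped.
        for (Node n = af.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue;
            }
            Element e = (Element) n;
            if (e.getTagName().equals("nursery")) {
                if (before == null) {
                    before = e;
                }
                after = e;
            } else if (e.getTagName().equals("time") && e.hasAttribute("totalms")) {
                total = e;
            }
        }
        if (before == null || total == null) {
            throw new IllegalArgumentException("missing <nursery> or <time totalms=...>");
        }
        return new double[] {
            Double.parseDouble(before.getAttribute("freebytes")),
            Double.parseDouble(after.getAttribute("freebytes")),
            Double.parseDouble(total.getAttribute("totalms"))
        };
    }
}
```

A full log contains many stanzas inside an enclosing root element, so a real tool would iterate over them; the sketch handles one stanza to keep the mapping between the callouts and the XML visible.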
Diagnostic tool for garbage collector
• Diagnostic tool for optimizing parameters affecting the garbage collector while using the IBM JVM
• Reads the “verbosegc” output and produces textual and graphical visualizations and related statistics
  – Frequency of garbage collection cycles
  – Time spent in different phases of garbage collection
  – Quantities of heap memory involved in the process
  – Characteristics of allocation failures
  – Mark Stack Overflows
• Built-in parsers for JVM 1.5, 1.4.2, 1.3.1 & 1.2.2
• Prerequisite: JFreeChart libraries (freely downloadable)
Screen Shots
Starting the tool and selecting the verbosegc input file and the appropriate JVM parser:
– Extract GCCollector.zip
– Place jfreeChart-0.9.21.jar and jcommon-0.9.6.jar in the lib directory of the GCCollector folder
– Execute GCCollector.bat (which will spawn a GUI)
– Select the verbosegc file for analysis
[Screenshot: Duration of GC Cycles]
[Screenshot: Graphical view of heap usage]
[Screenshot: Information for specific cycle]
[Screenshot: Choosing time range]
Extensible Verbose Tool Kit
Analyzing your verbose GC output
• EVTK: Verbose GC visualizer and analyzer
• Available through ISA
  – IBM Support Assistant v3.0.2
    https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=isa
• Reduces the barrier to understanding verbose output
  – Visualize GC data to track trends and relationships
  – Analyze results and provide general feedback
  – Extend to consume output related to the application
Starting the EVTK
– In the Tools section of ISA, select the Java product plug-in to display the available tools
– Click on the name of a tool to start that tool
EVTK usage scenarios
• Investigate performance problems
  – Long periods of pausing or unresponsiveness
• Evaluate your heap size
  – Check heap occupancy and adjust heap size if needed
• Garbage collection policy tuning
  – Examine GC characteristics, compare different policies
• Look for memory growth
  – Heap consumption slowly increasing over time
Extensible Verbose Toolkit overview
• The Extensible Verbose Toolkit (EVTK) is a visualizer for verbose garbage collection output
  – The tool parses and plots verbose GC output and garbage collection traces (-Xtgc output)
• The tooling framework is extensible, and will be expanded over time to include visualization for other collections of data
• The EVTK provides
  – Raw view of data
  – Line plots to visualize a variety of GC data characteristics
  – Tabulated reports with heap occupancy recommendations
  – View of multiple datasets on a single set of axes
Plotting data with the EVTK
[Screenshot with callouts]
– Use File > Open to open a new input file
– The VGC Data menu allows you to choose what data to display
– The Line plot tab contains the data visualization
– Use File > Add to add multiple input files to a single data set for comparison and aggregated display
– The Axes panel supports customized units and pan-and-zoom
– Right-click on the plot and use the context menu to
Reports and recommendations
• Report contents can be configured using VGC menu options
  – Occupancy recommendations tell you how to adjust heap size for better performance
  – Summary information is generated for each input in the dataset
  – Graphs included for all GC display data
• Can export as HTML by right-clicking and using the context menu
The Report tab contains the
Types of graphs
• The EVTK has built-in support for over forty different types of graphs
  – These are configured in the VGC Data menu
  – Options vary depending on the current dataset and the parsers and post-processors that are enabled
• Some of the VGC graph types are:
  – Free tenured heap (after collection)
  – Tenured heap size
  – Tenure age
  – Free LOA (after collection)
  – Free SOA (after collection)
  – Total LOA
  – Total SOA
  – Used total heap
  – Pause times (mark-sweep-compact collections)
  – Pause times (totals, including exclusive access)
  – Compact times
  – Weak references cleared
• Note: Different graph types and a different menu are available for TGC output
Heap usage and occupancy recommendation
• This graph shows heap usage after garbage collection; it jumps up to around 60M and stays there
• The summary report shows that mean heap occupancy is 98% and that the application is spending over a third of its time doing garbage collection
EVTK – Heap Visualization
EVTK - Comparison & Advice
Compare runs…
Java 5 Shared Classes
• Available on all platforms.
• Feature enabled using the -Xshareclasses flag.
• Static class data is cached in shared memory
  – Shared between all IBM Java VMs
  – All application and bootstrap classes are shared
  – The cache persists beyond the lifetime of any JVM, but is lost on shutdown/reboot
• Provides savings in footprint and start-up times.
• Target: server environments where multiple JVMs exist on the same box.
• Multiple sharing strategies
  – Standard classloaders (including the Application Classloader) exploit this feature when enabled.
Shared Classes – Start up times
Sharing helps speed up JVM startup
[Chart: Startup Improvements with Shared Classes and AOT. Y axis: seconds, 0 to 30 (lower is better). Applications: eclipse 3.2.2, tomcat 5.5, WAS 6.1. Series: default, -Xshareclasses (Java 5.0), -Xshareclasses (Java 6.0).]
What does real-time mean?
• Real-time = predictability of performance
  – Hard: violations of timing constraints are hard failures
  – Soft: timing constraints are simply performance goals
• Constraints vary in magnitude (microseconds to seconds)
• Consequences of missing a timing constraint range from a service level agreement miss (stock trading) to life in jeopardy (airplanes)
• Real-fast is not real-time, but Real-slow is not real-good
WebSphere Real Time (WRT)
• 1.0 generally available in August 2006.
• WRT is a Java runtime providing highly predictable operation
  – Real-time garbage collection (Metronome)
  – Static and dynamic compilation
  – Full support for RTSJ (JSR 1)
  – Java SE 5.0 compliant
Real-Time Garbage Collection
The Metronome Garbage Collector
• Utilization
  – Percentage of time dedicated to the application in a given window of time
  – The application receives a minimum percentage of time to run
[Diagram: a time line alternating between Application and Collector slices, with a 10ms window marked]
• Metronome uses a 10ms sliding window to measure utilization
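Utilization over a sliding window is easy to compute from a timeline of collector activity. The sketch below is an illustration of the definition only, not of Metronome's scheduler: the class name UtilizationSketch is hypothetical, and time is modelled in assumed 1 ms slots that are each given wholly to either the collector or the application.

```java
/** Worst-case application utilization over any sliding window of a slot timeline. */
final class UtilizationSketch {
    /**
     * gcSlots[i] is true when slot i (assumed 1 ms) was used by the collector.
     * Returns the minimum fraction of any window that was left to the application.
     */
    static double minUtilization(boolean[] gcSlots, int windowSlots) {
        if (windowSlots <= 0 || gcSlots.length < windowSlots) {
            throw new IllegalArgumentException("timeline shorter than the window");
        }
        int gcInWindow = 0;
        for (int i = 0; i < windowSlots; i++) {
            if (gcSlots[i]) {
                gcInWindow++;
            }
        }
        int worst = gcInWindow;
        // Slide the window one slot at a time, adding the new slot and dropping the oldest.
        for (int end = windowSlots; end < gcSlots.length; end++) {
            if (gcSlots[end]) {
                gcInWindow++;
            }
            if (gcSlots[end - windowSlots]) {
                gcInWindow--;
            }
            worst = Math.max(worst, gcInWindow);
        }
        return 1.0 - (double) worst / windowSlots;
    }
}
```

With a 10-slot window, a timeline in which the collector never takes more than 3 slots out of any 10 consecutive slots yields a minimum utilization of 70%, regardless of how long the run is.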
Visual Performance Analyser
• Eclipse-based performance visualization kit
  – Profile Analyzer
  – Code Analyzer
  – Pipeline Analyzer
• Profile Analyzer provides a powerful set of graphical and text-based views that allow users to narrow down performance problems to a particular process, thread, module, symbol, offset, instruction or source line.
  – Supports time-based system profiles
• Code Analyzer examines executable files and displays detailed information about functions, basic blocks and assembly instructions.
• Pipeline Analyzer displays the pipeline execution of instruction traces.
Profile Analyzer Views
• The following are important views within Profile Analyzer:
Basic Blocks
Call-Graph Callers/Descendants
Compiler Listing
Console
Counters
Database Connections
Disassembly/Offsets
Disassembly Comparison
Java/Classes Hierarchy
Profile Comparison
Profile Details
Profile Resources
Resolved Calls
Sample Distribution Chart
Source Code
Symbol Distribution
Temporal Profiling View
Counters: In the Process Hierarchy View (View Window -> Show View -> Others)
Summary
• Takeaways from the session:
  – A high-level overview of the Java Virtual Machine (JVM) and its key components.
  – An understanding of the different garbage collection schemes in the Sovereign and J9 virtual machines and their impact on JVM runtime performance.
  – Knowledge of how to use verbosegc output effectively to improve application response times.
  – An introduction to Shared Classes technology and its performance gains.
  – Familiarity with the debugging and profiling tools available for the JVM and their effective usage.
References & Further Reading
• www.ibm.com/developerworks/java/library/j-ibmjava2
• developers.sun.com/learning/javaoneonline/2007/pdf/TS-2023.pdf
• http://www.ibm.com/developerworks/java/library/j-ibmjava4/
• http://www-128.ibm.com/developerworks/java/library/j-rtj1/index.html
• https://www-950.ibm.com/events/IBMImpact/Impact2007/3977.pdf
Questions?
Thank You