
Bangalore, India August 6 - 9, 2007

IBM WebSphere Technical Conference

Session Number: W02

Tuning the Java Virtual Machine for Optimal Performance: Means and Methods

Rajeev Palanki
IBM Java Technology Center
[email protected]

2007 WebSphere Technical Conference (Bangalore, India)

© 2007 IBM Corporation 2

Objectives

Gain insight into the key aspects of JVM runtime performance and an understanding of the means and methods for tuning the JVM for optimal performance.

At the end of this session you should have:

• A high-level overview of the Java Virtual Machine (JVM) and its key components.
• An understanding of the different garbage collection schemes in the Sovereign and J9 virtual machines and their impact on JVM runtime performance.
• Knowledge of how to use verbosegc output effectively to improve application response times.
• An introduction to Shared Classes technology and its performance gains.

Agenda

• Overview of the Java Virtual Machine and its key components
• Garbage Collection Basics (Sovereign VM, JDK 1.4.2)
• Profiling Garbage Collection: Verbosegc Outputs
• Garbage Collection Policies in Java 5.0
• Debugging and Analysis Tools for Garbage Collection
• Introduction to Shared Classes Technology
• Real Time Java: A Brief Overview


IBM Java Building Blocks


Java SDK

Virtual Machine

Class Libraries

JIT

ORB

XML

Security

Big decimal

RAS

The Java Application Stack

[Diagram: the Java application stack. Boxes shown: Java Code; Java Class Libraries; ORB; Java Class Extensions; Native Code; native libraries; Platform; and the JVM components Execution Management (XM), Core Interface (CI), Diagnostics, Execution Engine, Classloader, Lock, Data Conversion, and Storage.]

Building Blocks

The JDK is a key component in the application server stack from a performance perspective.

[Diagram layers: specific hardware architecture (instruction set); operating system (vendor-specific operating environment); Java SDK (Virtual Machine, Class Libraries, JIT); Application. Caption: “Write Once Run Anywhere”.]

• Class Libraries: a collection of well-defined code packages that help developers create business applications (3 specifications).
• Just-in-Time Compiler: a code generator that converts bytecodes into machine language instructions at run time.
• Virtual Machine: a platform-independent execution environment that abstracts operating system specifics from the developer/user.

IBM Java 5.0

IBM has completely redesigned and rewritten its VM for Java 5.0. You may have heard it referred to as J9.

• New Virtual Machine implementation (J9)
• New garbage collection mechanism (Modron)
• New Just-In-Time compiler (Testarossa)
• Shared Classes technology

Just-In-Time Compiler (Testarossa)

• Multiple optimization levels
• Recompilation driven by a sampling thread
• Dynamic profile information collection
• Profiling thread helps determine the “hotness” of methods
• Asynchronous compilation

Garbage Collection (Modron)

• Uses a “type accurate” collector
• Introduces a parallel compactor

Garbage Collection Overview

• Garbage Collection (GC)
  • The main cause of memory-related performance bottlenecks in Java.
• Two things to look at in GC: frequency and duration
  • Frequency depends on the heap size and the allocation rate.
  • Duration depends on the heap size and the number of objects in the heap.
• GC algorithm
  • It is critical to understand how it works so that tuning can be done more intelligently.
• How do you eliminate GC bottlenecks?
  • Minimize the use of objects by following good programming practices.
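Frequency and duration can also be observed from inside a running JVM. The sketch below is one possible way to do that, using the standard java.lang.management API (available since Java 5.0) rather than anything shown on the slides. The class name and the allocation loop are illustrative assumptions, and the numbers reported depend entirely on the JVM and heap settings in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcFrequencyAndDuration {
    /** Returns {total collections, total collection time in ms} summed over all collectors. */
    static long[] snapshot() {
        long count = 0;
        long timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            timeMs += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, timeMs};
    }

    public static void main(String[] args) {
        long[] before = snapshot();
        // Hypothetical workload: allocate many short-lived arrays to raise the allocation rate.
        long allocated = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            allocated += junk.length;
        }
        long[] after = snapshot();
        System.out.println("bytes allocated: " + allocated);
        System.out.println("collections during workload: " + (after[0] - before[0]));
        System.out.println("time spent in GC (ms): " + (after[1] - before[1]));
    }
}
```

Running the same workload with a smaller maximum heap should, in general, raise the collection count, which is the frequency effect described above.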

The (IBM) JVM Garbage Collector

“The purpose of Garbage Collection is to identify Java Heap storage which is no longer being used by the Java application and so is available for reuse by the JVM”

Key questions:

• Performance and scalability: How quickly can you find garbage?
• Accuracy: Can you find all the garbage?

Garbage Collection: IBM Technology

• Concurrent mark: Most of the marking phase is done outside of ‘Stop the World’, while the ‘mutator’ threads are still active, giving a significant improvement in pause time.
• Parallelizing the garbage collection phases: The mark and sweep workload is distributed across the available processors, resulting in a significant improvement in pause times.
• Adaptive sizing of thread local heaps: Reduces the amount of Java heap locking.
• Incremental compaction: The expense of compaction is distributed across GCs, leading to a reduction in the occasional long pause time.

Java 5 technologies

The JVM Heaps

[Diagram: the Java heap and the native heap. The Java heap is shown with its free list: a chain of free-storage chunks, each recording its size and a pointer to the next chunk, ending in null. The native heap is shown holding thread stacks, buffers, JIT compiled code, and Motif structures.]

Allocation schemes

Two types of allocation:

• Cache allocation (for object allocations < 512 bytes) does not require the heap lock. Each thread has local storage in the heap (TLH, Thread Local Heap) where the objects are allocated.
• Heap lock allocation occurs when the allocation request is larger than 512 bytes, and requires the heap lock.

If size is less than 512 or enough space in the cache
    try cacheAlloc
    return if OK
HEAP_LOCK
do forever
    If there is a big enough chunk on freelist
        Take it
        goto Gotit
    else
        manageAllocFailure
        If any error
            goto GetOut
End do
Gotit:
    Initialize object
GetOut:
    HEAP_UNLOCK
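The pseudocode above can be sketched in Java as a toy model. This is not the JVM's implementation: the class name, the single-threaded cache, the list of chunk sizes, and the stubbed manageAllocFailure are all assumptions made for illustration. Only the 512-byte threshold and the overall control flow come from the slide.

```java
import java.util.ArrayList;
import java.util.List;

class AllocationSketch {
    static final int CACHE_LIMIT = 512;                 // threshold quoted on the slide
    private final Object heapLock = new Object();       // stands in for HEAP_LOCK
    private final List<Integer> freeList = new ArrayList<>(); // sizes of free chunks (toy model)
    private int cacheFree;                              // bytes left in the thread-local heap

    AllocationSketch(int tlhBytes, int... freeChunks) {
        cacheFree = tlhBytes;
        for (int c : freeChunks) freeList.add(c);
    }

    /** Returns "cache", "heap", or "failed" to show which path served the request. */
    String allocate(int size) {
        if ((size < CACHE_LIMIT || cacheFree >= size) && tryCacheAlloc(size)) {
            return "cache";                             // no lock was taken
        }
        synchronized (heapLock) {
            while (true) {
                for (int i = 0; i < freeList.size(); i++) {
                    int chunk = freeList.get(i);
                    if (chunk >= size) {                // first chunk that is big enough
                        if (chunk == size) freeList.remove(i);
                        else freeList.set(i, chunk - size);
                        return "heap";
                    }
                }
                if (!manageAllocFailure()) return "failed";
            }
        }
    }

    private boolean tryCacheAlloc(int size) {
        if (cacheFree < size) return false;
        cacheFree -= size;
        return true;
    }

    /** Placeholder for GC and heap expansion; this toy version can never free anything. */
    private boolean manageAllocFailure() {
        return false;
    }

    public static void main(String[] args) {
        AllocationSketch heap = new AllocationSketch(1024, 4096, 65536);
        System.out.println(heap.allocate(100));     // small request, served from the cache
        System.out.println(heap.allocate(2000));    // does not fit the cache, takes the heap lock
        System.out.println(heap.allocate(100000));  // no chunk is large enough
    }
}
```

The point of the split is that the common case (small objects) never touches the shared lock, so only large or cache-missing requests contend with other threads.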

Large Object Allocation

• All objects >= 64K are termed “large” from the VM perspective.
• In practice, objects of 10MB+ in size are usually considered large.
• The Large Object Area (LOA) is 5% of the active heap by default.
• The VM first tries to allocate any object from the free list of the main heap. If there is not enough contiguous space in the main heap to satisfy an allocation request for an object >= 64K, the object is allocated in the Large Object Area (wilderness).
• Objects < 64K can only be allocated in the main heap, never in the Large Object Area.

[Diagram: the heap up to Xmx, showing the active heap and the LOA.]

Users can identify the Java stack of a thread making an allocation request larger than the value specified in the environment variable ALLOCATION_THRESHOLD:

export ALLOCATION_THRESHOLD=5400

This gives Java stacks for allocations of objects larger than 5400 bytes.

Users can specify the desired percentage of the Large Object Area using the -Xloratio<n> option, where n determines the fraction of the heap designated for the LOA.

Subpools

Subpool provides an improved object allocation policy. It is available from JDK 1.4.1 releases, only on AIX.

• Improved time for allocating objects
• Avoids premature GCs due to the allocation of large objects
• Improves MP scalability by reducing time spent under HEAP_LOCK
• Optimizes TLH sizes and storage utilization

The subpool algorithm uses multiple free lists rather than the single free list used by the default allocation scheme.

It tries to predict the size of future allocation requests based on earlier allocation requests, and recreates the free lists at the end of each GC based on these predictions.

When allocating objects on the heap, free chunks are chosen using a "best fit" method, as opposed to the "first fit" method used in other algorithms.

It is enabled by the -Xgcpolicy:subpool option.
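The difference between "first fit" and "best fit" can be shown with a small sketch. It models the free chunks as a plain array of sizes, which is an assumption for illustration. The real subpool allocator keeps multiple free lists and is not reproduced here.

```java
class FitPolicies {
    /** Index of the first free chunk that can hold the request, or -1 if none can. */
    static int firstFit(int[] freeChunks, int size) {
        for (int i = 0; i < freeChunks.length; i++) {
            if (freeChunks[i] >= size) return i;
        }
        return -1;
    }

    /** Index of the smallest free chunk that can hold the request, or -1 if none can. */
    static int bestFit(int[] freeChunks, int size) {
        int best = -1;
        for (int i = 0; i < freeChunks.length; i++) {
            boolean fits = freeChunks[i] >= size;
            if (fits && (best == -1 || freeChunks[i] < freeChunks[best])) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        int[] free = {8192, 1024, 600, 4096};          // hypothetical free chunk sizes in bytes
        System.out.println(firstFit(free, 512));       // picks index 0, splitting the 8192 chunk
        System.out.println(bestFit(free, 512));        // picks index 2, the 600-byte chunk
    }
}
```

First fit carves the request out of the large chunk, while best fit leaves it intact for a later large allocation. That is one way to read the "avoid premature GCs due to allocation of large objects" goal above.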

Garbage Collection Basics (JVM 1.4.2)

Garbage collection is performed when there is:
• An allocation failure in heap lock allocation
• An explicit call to System.gc()

Garbage collection is Stop the World: all other application threads are suspended during GC.

Two main technologies are used to remove garbage:
• Mark Sweep Collector
• Copy Collector

GC occurs in the thread that handled the request:
• The object allocation request that caused the allocation failure
• The programmatically requested GC

The thread must acquire certain locks required for GC:
• Heap lock
• Thread queue lock

Object reclamation process for a Mark Sweep Collector

• Obtain locks and suspend threads
• Mark phase
  • The process of identifying all objects reachable from the root set.
  • All “live” objects are marked by setting a mark bit in the mark bit vector.
• Sweep phase
  • Identifies all the objects that have been allocated but are no longer referenced.
• Compaction (optional)
  • Once garbage has been removed, we consider compacting the resulting set of objects to remove the spaces between them.
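The mark and sweep phases can be sketched on a toy object graph. The Obj class, the boolean flag standing in for the mark bit vector, and the list standing in for the heap are assumptions for illustration only. Locking, thread suspension, and compaction are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class MarkSweepSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;               // stands in for this object's bit in the mark bit vector
        Obj(String name) { this.name = name; }
    }

    /** Mark phase: flag every object reachable from the root set. */
    static void mark(List<Obj> roots) {
        Deque<Obj> markStack = new ArrayDeque<>(roots);
        while (!markStack.isEmpty()) {
            Obj o = markStack.pop();
            if (o.marked) continue;   // already visited, which also handles cycles
            o.marked = true;
            for (Obj r : o.refs) markStack.push(r);
        }
    }

    /** Sweep phase: remove unmarked objects from the heap and return their names. */
    static List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(o -> {
            if (o.marked) {
                o.marked = false;     // survivor: reset the bit for the next cycle
                return false;
            }
            freed.add(o.name);
            return true;
        });
        return freed;
    }

    /** Builds a four-object heap where only a and b are reachable, then collects it. */
    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);                // c and d reference each other but no root reaches them
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [c, d]
    }
}
```

Note that c and d are reclaimed even though they reference each other: reachability from the root set, not reference counts, decides what is live.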

Mark Sweep Compact Algorithm

[Diagram: the root set pointing into the heap, with reachable and unreachable objects distinguished.]

Together we achieve: Some parallel processing

GC helper threads
• On a multiprocessor system with N CPUs, a JVM supporting parallel mode starts N-1 garbage collection helper threads at initialization.
• These threads remain idle whenever application code is running; they are called into play only when garbage collection is active.
• For a particular phase, work is divided between the thread driving the garbage collection and the helper threads, making a total of N threads running in parallel on an N-CPU machine.
• The only way to disable parallel mode is to use the -Xgcthreads parameter to change the number of garbage collection helper threads that are started.

Parallel mark
• The basic idea is to augment object marking by adding helper threads and a facility for sharing work between them.

Parallel bitwise sweep
• Similar to parallel mark; uses the same helper threads as parallel mark.

Concurrent Marking

• Designed to give reduced and consistent GC pause times as heap sizes increase.
• Concurrent marking aims to complete the marking just before the heap is full.
• In the concurrent phase, the garbage collector scans the roots by asking each thread to scan its own stack. These roots are then used to trace live objects concurrently.
• Tracing is done by a low-priority background thread and by each application thread when it does a heap lock allocation.

Incremental Compaction

Incremental compaction removes the dark matter from the heap and reduces pause times significantly.

The fundamental idea behind incremental compaction is to split the heap into sections and compact each section just as during a full compaction.

Incremental compaction was introduced in JDK 1.4.0. It is enabled by default and triggered under particular conditions (called "reasons").

• -Xpartialcompactgc: invokes incremental compaction in every GC cycle.
• -Xnopartialcompactgc: disables incremental compaction.
• -Xnocompactgc: disables full compaction.

• The heap is divided into regions.
• The regions are further divided into sections.
• Each section is handled by one helper thread.
• A region is divided into (number of helper threads + 1) sections or 8 sections, whichever is less.

Explicit Garbage Collection

• The garbage collector is called only under two conditions:
  • When an allocation failure occurs
  • When GC is explicitly requested using System.gc()
• Don’t call System.gc() at all. It hurts more often than it helps. GC knows when it should run.
• The temptation to scatter System.gc() calls here, there, and everywhere is enormous. It is not a good idea.

TRUST ME !!!!!

Profiling Garbage Collection: Verbosegc output

The most indispensable tool for profiling GC activity is verbosegc, from the JVM runtime.

Enabled using -verbosegc on the java command line.

Verbosegc redirection:
-Xverbosegclog:<path to file>filename

Verbosegc redirection to multiple files:
-Xverbosegclog:<path to file>filename#,X,Y

Understanding a typical verbosegc output

<AF[71]: Allocation Failure. need 65552 bytes, 3 ms since last AF>
<AF[71]: managing allocation failure, action=2 (142696/10484224)>
<GC(71): GC cycle started Fri Mar 19 17:59:06 2004
<GC(71): freed 94184 bytes, 2% free (236880/10484224), in 12 ms>
<GC(71): mark: 5 ms, sweep: 0 ms, compact: 7 ms>
<GC(71): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<GC(71): moved 3095 objects, 188552 bytes, reason=1>
<AF[71]: managing allocation failure, action=3 (236880/10484224)>
<AF[71]: managing allocation failure, action=4 (236880/10484224)>
<AF[71]: managing allocation failure, action=6 (236880/10484224)>
JVMDG217: Dump Handler is Processing a Signal - Please Wait.
JVMDG315: JVM Requesting Heap dump file
JVMDG318: Heap dump file written to /workarea/rajeev/gctests/heapdump.20040319.175906.8467.txt
JVMDG303: JVM Requesting Java core file
JVMDG304: Java core file written to /workarea/rajeev/gctests/javacore.20040319.175906.8467.txt
JVMDG274: Dump Handler has Processed OutOfMemory.
<AF[71]: insufficient heap space to satisfy allocation request>
<AF[71]: completed in 203 ms>

When are GC messages printed out?

• The first two lines are printed just before the STW phase of GC begins.
• The rest of the messages are printed after the STW phase ends and the threads are woken up. No messages are printed during GC.
• Heap shrinkage messages are printed before the STW messages, but shrinkage happens AFTER the STW phase!
• Heap expansion messages are correctly printed AFTER the STW messages.

Things to look for in a verbosegc output

• Was it an Allocation Failure GC?
• What was the size of the allocation request that caused the AF?
• What were the total and free heap sizes before GC?
• What was the total pause time?
• Where was the most time spent in GC?
• Are we doing a compaction in each GC cycle?
• What actions were taken by GC?
• Was GC able to meet the allocation request in the end?
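Several of these questions can be answered mechanically. As a minimal sketch, the class below pulls the numbers out of the "freed" summary line of the JDK 1.4.2 format shown in the sample output. The class name and the regular expression are assumptions written against that one sample line. Other JVM levels print different formats, so this is not a general verbosegc parser.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VerboseGcLine {
    // Written against the sample line: <GC(71): freed 94184 bytes, 2% free (236880/10484224), in 12 ms>
    private static final Pattern FREED = Pattern.compile(
        "<GC\\((\\d+)\\): freed (\\d+) bytes, (\\d+)% free \\((\\d+)/(\\d+)\\), in (\\d+) ms>");

    /** Returns {cycle, freedBytes, percentFree, freeBytes, totalBytes, pauseMs}, or null if no match. */
    static long[] parseFreed(String line) {
        Matcher m = FREED.matcher(line.trim());
        if (!m.matches()) return null;
        long[] out = new long[6];
        for (int i = 0; i < out.length; i++) {
            out[i] = Long.parseLong(m.group(i + 1));   // long, because heaps can exceed 2 GB
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "<GC(71): freed 94184 bytes, 2% free (236880/10484224), in 12 ms>";
        System.out.println(Arrays.toString(parseFreed(line)));
        // prints [71, 94184, 2, 236880, 10484224, 12]
    }
}
```

Collecting the last element (pause time) and the free/total pair across a whole log gives the pause-time and heap-occupancy trends that the questions above ask about.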

GC actions (JDK 1.4.2)

Look for lines of this type:

managing allocation failure, action=<n>

where <n> is the numerical value of the action taken.

Actions:
0 -> GC because of exhaustion of the pinned free list.
1 -> Perform garbage collection without using the wilderness.
2 -> Garbage collector tried to allocate out of the wilderness and failed.
3 -> Expand the Java heap.
4 -> Clear soft references.

Using verbosegc to set the heap size

Use verbosegc to estimate the ideal heap size, and then tune using -Xmx and -Xms.

Setting -Xms:
• Should be big enough to avoid AFs from the time the application starts to the time it becomes ‘ready’. (It should not be any bigger!)

Setting -Xmx:
• Under normal load, free heap space after each GC should be > minf (default is 30%).
• There should not be any OutOfMemory errors.
• Under the heaviest load, if free heap space after each GC is > maxf (default is 70%), the heap size is too big.
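The -Xmx rules of thumb reduce to a comparison of the free fraction after GC against the two thresholds. The sketch below encodes only that comparison. The class and method names are illustrative, and the 30% and 70% constants are the defaults quoted above, not values read from a running JVM.

```java
class HeapSizeCheck {
    static final double MINF = 0.30;   // default minimum free fraction quoted on the slide
    static final double MAXF = 0.70;   // default maximum free fraction quoted on the slide

    /** Classifies a heap from the free and total byte counts reported after a GC. */
    static String assess(long freeBytesAfterGc, long totalBytes) {
        double freeFraction = (double) freeBytesAfterGc / totalBytes;
        if (freeFraction < MINF) return "too small";
        if (freeFraction > MAXF) return "too big";
        return "ok";
    }

    public static void main(String[] args) {
        // Free/total pair in the style of a verbosegc "freed" line: about 3% free after GC.
        System.out.println(assess(323424L, 10484224L));   // too small
        System.out.println(assess(5L, 10L));              // ok
        System.out.println(assess(8L, 10L));              // too big
    }
}
```

A single GC cycle is not enough to judge: the slide's rules apply to the free space after each GC under normal load (for minf) and under the heaviest load (for maxf).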

Example of verbosegc when heap is too small

GC is too frequent

<AF[25]: Allocation Failure. need 65552 bytes, 1 ms since last AF>
<AF[25]: managing allocation failure, action=2 (319456/10484224)>
<GC(25): GC cycle started Sat Mar 20 15:32:50 2004
<GC(25): freed 3968 bytes, 3% free (323424/10484224), in 11 ms>
<GC(25): mark: 5 ms, sweep: 0 ms, compact: 6 ms>
<GC(25): refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
<GC(25): moved 214 objects, 9352 bytes, reason=1>
<AF[25]: managing allocation failure, action=3 (323424/10484224)>
<AF[25]: managing allocation failure, action=4 (323424/10484224)>
<AF[25]: managing allocation failure, action=6 (323424/10484224)>
<AF[25]: warning! free memory getting short(1). (323424/10484224)>
<AF[25]: completed in 13 ms>

Example of verbosegc when heap is too big

GC is too long

<AF[29]: Allocation Failure. need 2321688 bytes, 88925 ms since last AF>
<AF[29]: managing allocation failure, action=1 (3235443800/20968372736) (3145728/3145728)>
<GC(29): GC cycle started Mon Nov 4 14:46:20 2002
<GC(29): freed 8838057824 bytes, 57% free (12076647352/20971518464), in 4749 ms>
<GC(29): mark: 4240 ms, sweep: 509 ms, compact: 0 ms>
<GC(29): refs: soft 0 (age >= 32), weak 0, final 1, phantom 0>
<AF[29]: completed in 4763 ms>

Effect of wrong -Xms & -Xmx settings

• Too small a heap = too frequent GC.
• Too big a heap = too much GC pause time (irrespective of the amount of physical memory on the system).
• Heap size > physical memory size = paging/swapping = bad for your application.
• It is desirable to set Xms much lower than Xmx if you are encountering fragmentation issues. This forces class allocations, threads, and persistent objects to be allocated at the bottom of the heap.

What about Xms=Xmx?
• It means no heap expansion or shrinkage will ever occur.
• Not normally recommended.
• It may be good for a few apps that require constant high heap storage space.
• It hurts apps that show a varying load.

Mark Stack Overflow (MSO)

Verbosegc will contain the line:

<GC(45): mark stack overflow>

• Is bad for performance.
• Caused by too many objects on the heap, especially deeply nested objects.
• Processing MSOs is expensive.

No solution is a silver bullet. Things to try:
• Decrease the heap size!
• Use concurrent mark (-Xgcpolicy:optavgpause)
• Redesign the application.

High pause times and system activity

• When pause times are usually acceptable with the exception of a few "abnormally high" spikes, we are likely to infer that the deviation was the result of some system-level activity outside the Java process (heavy paging, for example).
• Consideration: how many clock ticks did our process actually spend executing instructions, as opposed to time spent waiting for I/O or waiting for a CPU to become available for the process to run on?

Headline Changes in Java 5.0 Garbage Collector

Java Tiger: JSE 5.0

• 5.0 uses a completely new memory management framework
• No pinned/dosed objects
  • Stack maps are used to provide a level of indirection between references and the heap
  • The 5.0 VM never pins arrays; it always makes a copy for the JNI code
• The GC is type accurate
• New efficient parallel compactor

Garbage collection policies in IBM Java 5.0

Four policies are available:

• optthruput (default)
  • Mark-sweep algorithm
  • Fastest for many workloads
• optavgpause
  • Concurrent collection and concurrent sweep
  • Small mean pause
  • Throughput impact
• gencon
  • The Generational Hypothesis
  • Fastest for transactional workloads
  • Combines low pause times and high throughput
• subpools
  • Mark-sweep based, but with multiple freelists
  • Avoids allocation contention on many-processor machines

Why different GC Policies?

• The availability of different GC policies gives you increased capabilities.
• The best choice depends on application behaviour and workloads.
• Think about throughput, response times and pause times.

Throughput is the amount of data processed by the application.
Pause time is the amount of time the garbage collector has stopped threads while collecting the heap.
Response time is the latency of the application: how quickly it answers incoming requests.

Policy: Optimize for throughput
Option (-Xgcpolicy): optthruput (default)
Description: Typically used for applications where raw throughput is more important than short GC pauses. The application is stopped each time that garbage is collected.

Policy: Optimize for pause times
Option (-Xgcpolicy): optavgpause
Description: Trades high throughput for shorter GC pauses by performing some of the garbage collection concurrently. The application is paused for shorter times.

Policy: Generational Concurrent
Option (-Xgcpolicy): gencon
Description: Handles short-lived objects differently from longer-lived ones.

Policy: Subpool
Option (-Xgcpolicy): subpool
Description: Uses an algorithm similar to the default policy but employs an allocation strategy suitable for SMPs.

Runtime Performance Tuning: What policy should I choose for my J9 VM?

Policy: optthruput
Considerations:
• I want my application to run to completion as quickly as possible.

Policy: optavgpause
Considerations:
• My application requires good response times to unpredictable events.
• A degradation in performance is acceptable as long as GC pause times are reduced.
• My application is running very large Java heaps.
• My application is a GUI application and user response times are important.

Policy: gencon
Considerations:
• My application allocates many short-lived objects.
• The Java heap space is fragmented.

Memory Management / Garbage Collection: How the IBM J9 Generational Garbage Collector Works

[Diagram: the JVM heap divided into its generations, annotated with the sizing options below.]

• Nursery/Young Generation
  • IBM J9: -Xmn (-Xmns/-Xmnx)
  • Sun: -XX:NewSize=nn, -XX:MaxNewSize=nn, -Xmn<size>
• Old Generation
  • IBM J9: -Xmo (-Xmos/-Xmox)
  • Sun: -XX:NewRatio=n
• Permanent Space (Sun JVM only): -XX:MaxPermSize=nn

• Minor collection: takes place only in the young generation and is normally done through direct copying, which is very efficient.
• Major collection: takes place in the new and old generations and uses the normal mark/sweep (+compact) algorithm.

Nursery/Young Generation

[Diagram: the nursery shown twice, with the allocate space and the survivor space in swapped positions.]

The nursery is split into two spaces (semi-spaces):
• Only one contains live objects and is available for allocation.
• Minor collections (scavenges) move objects between the spaces.
• The roles of the spaces are then reversed.

Sample verbosegc output for gencon

<af type="nursery" id="35" timestamp="Thu Aug 11 21:47:11 2005" intervalms="10730.361">
  <minimum requested_bytes="144" />
  <time exclusiveaccessms="1.193" />
  <nursery freebytes="0" totalbytes="1226833920" percent="0" />
  <tenured freebytes="68687704" totalbytes="209715200" percent="32" >
    <soa freebytes="58201944" totalbytes="199229440" percent="29" />
    <loa freebytes="10485760" totalbytes="10485760" percent="100" />
  </tenured>
  <gc type="scavenger" id="35" totalid="35" intervalms="10731.054">
    <flipped objectcount="1059594" bytes="56898904" />
    <tenured objectcount="12580" bytes="677620" />
    <refs_cleared soft="0" weak="691" phantom="39" />
    <finalization objectsqueued="1216" />
    <scavenger tiltratio="90" />
    <nursery freebytes="1167543760" totalbytes="1226833920" percent="95" tenureage="14" />
    <tenured freebytes="67508056" totalbytes="209715200" percent="32" >
      <soa freebytes="57022296" totalbytes="199229440" percent="28" />
      <loa freebytes="10485760" totalbytes="10485760" percent="100" />
    </tenured>
    <time totalms="368.309" />
  </gc>
  <nursery freebytes="1167541712" totalbytes="1226833920" percent="95" />
  <tenured freebytes="67508056" totalbytes="209715200" percent="32" >
    <soa freebytes="57022296" totalbytes="199229440" percent="28" />
    <loa freebytes="10485760" totalbytes="10485760" percent="100" />
  </tenured>
  <time totalms="377.634" />
</af>

Callouts on the slide:
• Allocation request details, and the time it took to stop all mutator threads.
• Heap occupancy details before GC.
• Heap occupancy details after GC.
• Details about the scavenge.
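Because the Java 5.0 output is XML, it can be read with the JDK's standard DOM parser. The sketch below is one possible approach, not a description of how any IBM tool works. It parses a trimmed copy of the sample <af> element, and the class and method names are assumptions. A real log holds many such elements and would need to be handled as a whole document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class GenconAfParser {
    // Trimmed copy of the <af> element from the sample output.
    static final String SAMPLE =
        "<af type=\"nursery\" id=\"35\" intervalms=\"10730.361\">"
        + "<nursery freebytes=\"0\" totalbytes=\"1226833920\" percent=\"0\" />"
        + "<gc type=\"scavenger\" id=\"35\" totalid=\"35\" intervalms=\"10731.054\">"
        + "<nursery freebytes=\"1167543760\" totalbytes=\"1226833920\" percent=\"95\" tenureage=\"14\" />"
        + "<time totalms=\"368.309\" />"
        + "</gc>"
        + "<time totalms=\"377.634\" />"
        + "</af>";

    private static Element firstIn(Element parent, String tag) {
        return (Element) parent.getElementsByTagName(tag).item(0);
    }

    /** Parses one <af> element and returns its nested <gc> element. */
    private static Element scavenge(String afXml) {
        try {
            Element af = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(afXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            return firstIn(af, "gc");
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed <af> element", e);
        }
    }

    /** Percentage of the nursery that is free once the scavenge has finished. */
    static int nurseryPercentFreeAfterGc(String afXml) {
        return Integer.parseInt(firstIn(scavenge(afXml), "nursery").getAttribute("percent"));
    }

    /** Time spent in the scavenge itself, in milliseconds. */
    static double scavengeMs(String afXml) {
        return Double.parseDouble(firstIn(scavenge(afXml), "time").getAttribute("totalms"));
    }

    public static void main(String[] args) {
        System.out.println(nurseryPercentFreeAfterGc(SAMPLE));   // prints 95
        System.out.println(scavengeMs(SAMPLE));                  // prints 368.309
    }
}
```

Reading the values inside <gc> rather than at the top level of <af> matters here: the same tag names appear at both levels, before and after the collection.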

Diagnostic tool for garbage collector

• Diagnostic tool for optimizing the parameters that affect the garbage collector when using the IBM JVM
• Reads the “verbosegc” output and produces textual and graphical visualizations and related statistics:
  • Frequency of garbage collection cycles
  • Time spent in the different phases of garbage collection
  • Quantities of heap memory involved in the process
  • Characteristics of allocation failures
  • Mark stack overflows
• Built-in parsers for JVM 1.5, 1.4.2, 1.3.1 and 1.2.2
• Prerequisite: JFreeChart libraries (freely downloadable)

Screen Shots

Starting the tool and selecting the verbosegc input file and the appropriate JVM parser:

• Extract GCCollector.zip
• Place jfreeChart-0.9.21.jar and jcommon-0.9.6.jar in the lib directory of the GCCollector folder
• Execute GCCollector.bat (which will spawn a GUI)
• Select the verbosegc file for analysis

[Screenshot: Duration of GC Cycles]

[Screenshot: Graphical view of heap usage]

[Screenshot: Information for specific cycle]

[Screenshot: Choosing time range]

Extensible Verbose Tool Kit: Analyzing your verbose GC output

EVTK: Verbose GC visualizer and analyzer

• Available through ISA
  • IBM Support Assistant v3.0.2
    https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=isa
• Reduces the barrier to understanding verbose output
  • Visualize GC data to track trends and relationships
  • Analyze results and provide general feedback
  • Extend to consume output related to the application

Starting the EVTK

• In the Tools section of ISA, select the Java product plug-in to display the available tools.
• Click on the name of a tool to start that tool.

EVTK usage scenarios

• Investigate performance problems
  • Long periods of pausing or unresponsiveness
• Evaluate your heap size
  • Check heap occupancy and adjust the heap size if needed
• Garbage collection policy tuning
  • Examine GC characteristics, compare different policies
• Look for memory growth
  • Heap consumption slowly increasing over time

Extensible Verbose Toolkit overview

• The Extensible Verbose Toolkit (EVTK) is a visualizer for verbose garbage collection output.
  • The tool parses and plots verbose GC output and garbage collection traces (-Xtgc output).
• The tooling framework is extensible, and will be expanded over time to include visualization for other collections of data.
• The EVTK provides:
  • A raw view of the data
  • Line plots to visualize a variety of GC data characteristics
  • Tabulated reports with heap occupancy recommendations
  • A view of multiple datasets on a single set of axes

Plotting data with the EVTK

[Screenshot callouts:]
• Use File > Open to open a new input file.
• The VGC Data menu allows you to choose what data to display.
• The Line plot tab contains the data visualization.
• Use File > Add to add multiple input files to a single data set for comparison and aggregated display.
• The Axes panel supports customized units and pan-and-zoom.
• Right-click on the plot and use the context menu to

Reports and recommendations

• Report contents can be configured using the VGC menu options.
  • Occupancy recommendations tell you how to adjust the heap size for better performance.
  • Summary information is generated for each input in the dataset.
  • Graphs are included for all GC display data.
• Reports can be exported as HTML by right-clicking and using the context menu.
• The Report tab contains the

Types of graphs

The EVTK has built-in support for over forty different types of graphs.
• These are configured in the VGC Data menu.
• Options vary depending on the current dataset and the parsers and post-processors that are enabled.

Some of the VGC graph types are:
• Free tenured heap (after collection)
• Tenured heap size
• Tenure age
• Free LOA (after collection)
• Free SOA (after collection)
• Total LOA
• Total SOA
• Used total heap
• Pause times (mark-sweep-compact collections)
• Pause times (totals, including exclusive access)
• Compact times
• Weak references cleared

Note: Different graph types and a different menu are available for TGC output.

Heap usage and occupancy recommendation

• This graph shows heap usage after garbage collection; it jumps up to around 60M and stays there.
• The summary report shows that mean heap occupancy is 98% and that the application is spending over a third of its time doing garbage collection.

[Screenshot: EVTK Heap Visualization]

[Screenshot: EVTK Comparison & Advice. Compare runs…]

Java 5 Shared Classes

• Available on all platforms.
• The feature is enabled using the -Xshareclasses flag.
• Static class data is cached in shared memory
  • Shared between all IBM Java VMs
  • All application and bootstrap classes are shared
  • The cache persists beyond the lifetime of any JVM, but is lost on shutdown/reboot
• Provides savings in footprint and startup times.
• Target: server environments where multiple JVMs exist on the same box.
• Multiple sharing strategies
  • Standard classloaders (including the application classloader) exploit this feature when it is enabled.

(58)

Shared Classes: Start up times

Sharing helps speed up JVM startup

[Figure: "Startup Improvements with Shared Classes and AOT". Bar chart of startup time in seconds (axis 0 to 30, lower is better) for eclipse 3.2.2, tomcat 5.5 and WAS 6.1, comparing default, -Xshareclasses (Java 5.0) and -Xshareclasses (Java 6.0).]

(59)

What does real-time mean?

Real-time = predictability of performance

• Hard - Violations of timing constraints are hard failures

• Soft - Timing constraints are simply performance goals

Constraints vary in magnitude (microseconds to seconds)

Consequences of missing a timing constraint:

• from service level agreement miss (stock trading)

• to life in jeopardy (airplanes)

Real-fast is not real-time, but Real-slow is not real-good

(60)

WebSphere Real Time (WRT)

WRT 1.0 generally available in August 2006.

WRT is a Java runtime providing highly predictable operation

• Real-time garbage collection (Metronome)

• Static and dynamic compilation

• Full support for RTSJ (JSR 1)

• Java SE 5.0 compliant

(61)

Real-Time Garbage Collection

The Metronome Garbage Collector

Utilization

• Percentage of time dedicated to the application in a given window of time

[Figure: timeline alternating Application and Collector slices along a Time axis, with a 10ms window marked]

Application receives a minimum percentage of time to run

Metronome uses a 10ms sliding window to measure utilization
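
To make the utilization definition above concrete, the sketch below slides a 10ms window over a made-up schedule of collector pauses and reports the lowest fraction of any window left to the application. The pause schedule, class name and method names are hypothetical. It illustrates the measurement only, not how Metronome schedules its work.

```java
// Hypothetical sketch: application utilization over a sliding window,
// given collector pauses as [startMicros, endMicros) intervals.
class UtilizationSketch {

    // Collector time that falls inside the window [windowStart, windowEnd).
    static long collectorTime(long[][] pauses, long windowStart, long windowEnd) {
        long total = 0;
        for (long[] pause : pauses) {
            long overlap = Math.min(pause[1], windowEnd) - Math.max(pause[0], windowStart);
            if (overlap > 0) {
                total += overlap;
            }
        }
        return total;
    }

    // Fraction of one window that was available to the application.
    static double utilization(long[][] pauses, long windowStart, long windowMicros) {
        long gc = collectorTime(pauses, windowStart, windowStart + windowMicros);
        return (windowMicros - gc) / (double) windowMicros;
    }

    // Lowest utilization seen while sliding the window in fixed steps.
    static double minUtilization(long[][] pauses, long horizonMicros,
                                 long windowMicros, long stepMicros) {
        double min = 1.0;
        for (long start = 0; start + windowMicros <= horizonMicros; start += stepMicros) {
            min = Math.min(min, utilization(pauses, start, windowMicros));
        }
        return min;
    }

    public static void main(String[] args) {
        // Made-up schedule: a 1 ms collector slice every 4 ms, over 20 ms.
        long[][] pauses = {
            {0, 1_000}, {4_000, 5_000}, {8_000, 9_000}, {12_000, 13_000}, {16_000, 17_000}
        };
        double min = minUtilization(pauses, 20_000, 10_000, 100);
        System.out.printf("minimum utilization over 10 ms windows: %.0f%%%n", min * 100);
    }
}
```

With this schedule no 10ms window contains more than 3ms of collector time, so the application is left with at least 70% of every window.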

(62)

Visual Performance Analyzer

Eclipse based performance visualization kit

• Profile Analyzer

• Code Analyzer

• Pipeline Analyzer

Profile Analyzer provides a powerful set of graphical and text-based views that allow users to narrow down performance problems to a particular process, thread, module, symbol, offset, instruction or source line.

Supports time based system profiles

Code Analyzer examines executable files and displays detailed information about functions, basic blocks and assembly instructions.

Pipeline Analyzer displays the pipeline execution of instruction traces.

(63)

Profile Analyzer Views

The following are important views within Profile Analyzer:

• Basic Blocks
• Call-Graph Callers/Descendants
• Compiler Listing
• Console
• Counters
• Database Connections
• Disassembly/Offsets
• Disassembly Comparison
• Java/Classes Hierarchy
• Profile Comparison
• Profile Details
• Profile Resources
• Resolved Calls
• Sample Distribution Chart
• Source Code
• Symbol Distribution

(64)

Temporal Profiling View

(65)

Counters: In the Process Hierarchy View (View Window -> Show View -> Others)

(66)

Summary

Takeaways from the session:

• High level overview of the Java Virtual Machine (JVM) and its key components.

• Understanding of different Garbage Collection schemas in the Sovereign & J9 Virtual Machines and their impact on JVM Runtime performance.

• Knowledge about using verbosegc outputs effectively to improve application response times.

• Introduction to Shared Classes Technology and performance gains.

• Familiarity with debugging and profiling tools available for JVM and their effective usage.

(67)

References & Further Reading

www.ibm.com/developerworks/java/library/j-ibmjava2

developers.sun.com/learning/javaoneonline/2007/pdf/TS-2023.pdf

http://www.ibm.com/developerworks/java/library/j-ibmjava4/

http://www-128.ibm.com/developerworks/java/library/j-rtj1/index.html

https://www-950.ibm.com/events/IBMImpact/Impact2007/3977.pdf

(68)

Questions?

(69)

Thank You

Merci (French), Gracias (Spanish), Obrigado (Brazilian Portuguese), Grazie (Italian), Danke (German), Teşekkürler

(70)

Session Evaluation

Please complete your session evaluation
