Java Garbage Collection Best Practices for Sizing and Tuning the Java Heap


(1)

Java Garbage Collection

Best Practices for Sizing and Tuning the Java Heap

(2)

Objectives

Overview

Selecting the Correct GC Policy

Sizing the Java heap

(3)

Garbage Collection Performance

GC performance issues can take many forms

Definition of a performance problem is user centric

User requirement may be for:

• Very short GC “pause” times

• Maximum throughput

• A balance of both

First step is to ensure that the correct GC policy has been selected for the workload type

Helpful to have an understanding of GC mechanisms

Second step is to ensure heap sizing is correct

(4)

WebSphere® Support Technical Exchange

(5)

Understanding Garbage Collection

Responsible for allocation and freeing of:

Java objects, Array objects and Java classes

Allocates objects using a contiguous section of Java heap

Ensures the object remains as long as it is in use or “live”

Determination based on a reference from another “live” object or from outside of the Heap

Reclaims objects that are no longer referenced

Ensures that any finalize method is run before the object is reclaimed

(6)

Object Allocation

Requires a contiguous area of Java heap

Driven by requests from:

• The Java application

• JNI code

Most allocations take place in Thread Local Heaps (TLHs)

Threads reserve a chunk of free heap to allocate from

• Reduces contention on allocation lock

• Keeps code running in a straight line (fewer failures)

• Meant to be fast

Available for objects < 512 bytes in size

Larger allocations take place under a global “heap lock”

These allocations are one-time costs (out-of-line allocation)

Multiple threads allocating larger objects at the same time will contend

(7)

Garbage collection occurs under two scenarios:

An “allocation failure”

• An object allocation is requested and not enough contiguous memory is available

A programmatically requested garbage collection cycle

• a call is made to System.gc() or Runtime.gc()

• the Distributed Garbage Collector is running

• a call to JVMPI/JVMTI is made

Two main technologies used to remove the garbage:

• Mark Sweep Collector

• Copy Collector

(8)

Global Collection Policies

Garbage Collection can be broken down into 2 (3) steps

Mark: Find all live objects in the system

Sweep: Reclaim unused heap memory to the free list

Compact: Reduce fragmentation within the free list

All steps are in a single stop-the-world (STW) phase

Application “pauses” whilst garbage collection is done

Each step is performed as a parallel task within itself

Four GC “Policies”, optimized for different scenarios

-Xgcpolicy:optthruput optimized for “batch” type applications

-Xgcpolicy:optavgpause optimized for applications with responsiveness criteria

-Xgcpolicy:gencon optimized for highly transactional workloads

-Xgcpolicy:subpool optimized for large systems with allocation contention

(9)

Parallel Mark Sweep Collector, with compaction avoidance

Created to make use of additional processors on server systems

Designed to increase performance for SMP and not degrade performance for uni-processor systems

Optimized for “Throughput”

Best policy for “batch” type applications

Consists of a single “flat” Java heap:

(10)

Parallelism achieved through the use of “GC Helper Threads”

“Parked” set of threads that wake to share GC work

Main GC thread generates the root set of objects

Helper threads share the work for the rest of the phases

Number of helpers is one less than the number of processing units

So helper threads and main GC thread together equal the number of processing units

Configurable using -Xgcthreads
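As a rough illustration of the sizing rule above, the sketch below derives the default helper count from the processor count the JVM reports. The class and method names are hypothetical, and a real JVM may apply further limits that the slides do not describe.

```java
// Illustrative only: mirrors the rule "helpers = processing units - 1".
final class GcHelperThreads {

    // Number of helper threads for a given processor count (never negative).
    static int defaultHelperCount(int processors) {
        return Math.max(0, processors - 1);
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        int helpers = defaultHelperCount(processors);
        // Main GC thread plus helpers should equal the processor count.
        System.out.println("processors=" + processors
                + " helpers=" + helpers
                + " total GC threads=" + (helpers + 1));
    }
}
```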

GC Helper Threads

(11)
(12)

Reduces and makes more consistent the time spent inside Stop the World GC

Reduction usually between 90 and 95%

Achieved by carrying out some of the STW work whilst application is running

1.4.2: Concurrent Marking

5.0: Concurrent Marking and Concurrent Sweeping

Slight overhead on throughput for greatly reduced STW times

Policy is ideal for systems with responsiveness criteria

e.g. Portal applications

(13)

Parallel and Concurrent Mark/Sweep

(14)

Concurrent Mark – hidden object issue

(15)

Higher heap usage…

…because not all garbage removed

Concurrent Mark – hidden object issue

(16)

Similar in concept to that used by Sun and HP

Parallel copy and concurrent global collects by default

Motivation: Objects die young so focus collection efforts on recently created objects

Divide the heap up into two areas: “new” and “old”

Perform allocates from the new area

Collections focus on the new area

Objects that survive a number of collects in new area are promoted to old area (tenured)

Ideal for transactional and high data throughput workloads

Generational and Concurrent GC (gencon)

[Figure: gencon heap layout on a 0 GB to 2 GB scale, marked Heap Base, Heap Size and Heap Limit, divided into Nursery (new) Space, Tenured (old) Space and LOA]

(17)

Nursery/Young Generation

Nursery is split into two spaces (semi-spaces)

Only one contains live objects and is available for allocation

Minor collections (Scavenges) move objects between spaces

Role of spaces is reversed

Movement results in implicit compaction

[Figure: the nursery before and after a scavenge, with the Allocate Space and Survivor Space swapping roles]
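The scavenge described above can be pictured with a toy model. This is a minimal sketch, not the collector's implementation: the class `ToyNursery`, the `live` flag standing in for reachability, and the promotion threshold `tenureAge` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a two-space nursery; not how the real collector is implemented.
final class ToyNursery {

    // Stand-in for a heap object: a liveness flag and a survival count.
    static final class Obj {
        final String name;
        boolean live = true;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    private List<Obj> allocateSpace = new ArrayList<>();  // receives new objects
    private List<Obj> survivorSpace = new ArrayList<>();  // empty between scavenges
    final List<Obj> tenured = new ArrayList<>();
    private final int tenureAge;                          // hypothetical promotion threshold

    ToyNursery(int tenureAge) { this.tenureAge = tenureAge; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        allocateSpace.add(o);
        return o;
    }

    // Copy live objects out of the allocate space, then swap the two roles.
    void scavenge() {
        for (Obj o : allocateSpace) {
            if (!o.live) continue;          // garbage is simply left behind
            o.age++;
            if (o.age >= tenureAge) {
                tenured.add(o);             // survived enough collects: promote
            } else {
                survivorSpace.add(o);       // copied side by side: implicit compaction
            }
        }
        allocateSpace.clear();
        List<Obj> tmp = allocateSpace;      // flip: survivor becomes the allocate space
        allocateSpace = survivorSpace;
        survivorSpace = tmp;
    }

    int nurserySize() { return allocateSpace.size(); }

    public static void main(String[] args) {
        ToyNursery n = new ToyNursery(2);
        n.allocate("a");
        Obj b = n.allocate("b");
        b.live = false;                     // b dies young
        n.scavenge();                       // a survives once, b is reclaimed
        n.scavenge();                       // a survives twice and is tenured
        System.out.println("nursery=" + n.nurserySize() + " tenured=" + n.tenured.size());
    }
}
```

The model ignores object sizes and real reachability tracing; it only shows the copy, the role swap and age-based promotion.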

(18)

Subpooling (subpool)

Goals:

Reduce allocation lock contention by distributing free memory into multiple lists

Reduce allocation contention through use of atomic operations instead of a heap lock

Prevent premature garbage collections by using a “best fit” (or closer to best fit) policy instead of address ordered

Ideal for very large SMP systems where large amounts of data are being allocated

(19)

Looking for Heap Lock Contention

All locks can be profiled using Java Lock Analyzer (JLA)

http://www.alphaworks.ibm.com/tech/jla

(AlphaWorks)

Provides time accounting and contention statistics for Java and JVM locks

Functionality includes:

• Counters associated with contended locks

• Total number of successful acquires

• Recursive acquires: times a thread acquires a lock it already owns

(20)

JLA Sample Report

System (Registered) Monitors

%MISS    GETS  NONREC   SLOW     REC    TIER2   TIER3  %UTIL   AVER-HTM  MON-NAME
   87    5273    5273   4572       0   710708   18487      1      95408  JITC Global_Compile lock
    9    6870    6869    631       1   113420    2976      0      11807  Heap lock
    5    1123    1123     51       0    11098     286      1     248385  Binclass lock
    0    1153    1147      5       6     1307      33      0      47974  Monitor Cache lock
    0   46149   45877    134     272    36961     877      1       6558  JITC CHA lock
    0   33734   23483     19   10251     6544     150      1      17083  Thread queue lock
    0       5       5      0       0        0       0      0    9309689  JNI Global Reference lock
    0       5       5      0       0        0       0      0    9283000  JNI Pinning lock
    0       5       5      0       0        0       0      0    9442968  Sleep lock
    0       1       1      0       0        0       0      0          0  Monitor Registry lock
    0       0       0      0       0        0       0      0          0  Evacuation Region lock
    0       0       0      0       0        0       0      0          0  Method trace lock
    0       0       0      0       0        0       0      0          0  Classloader lock
    0       0       0      0       0        0       0      0          0  Heap Promotion lock

Java (Inflated) Monitors

%MISS    GETS  NONREC   SLOW     REC    TIER2   TIER3  %UTIL   AVER-HTM  MON-NAME
   15      68      68     10       0     2204      56      2   11936405  test.lock.testlock1@A09410/A09418
    2      42      42      1       0      186       5      0     300478  test.lock.testlock2@D31358/D31360
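In the sample report above, the %MISS column appears to equal SLOW divided by NONREC, rounded to a whole percent (for the Heap lock, 631 / 6869 is roughly 9%). That relationship is inferred from the numbers shown rather than taken from a JLA definition, so the helper below is a hypothetical sketch for reading such a report.

```java
// Hypothetical helper: reproduces %MISS from a report row, assuming
// %MISS = SLOW / NONREC (an inference from the sample data, not a JLA definition).
final class JlaMissRate {

    static long missPercent(long slow, long nonRecursiveGets) {
        if (nonRecursiveGets == 0) {
            return 0;                       // idle lock: no acquires, so no misses
        }
        return Math.round(100.0 * slow / nonRecursiveGets);
    }

    public static void main(String[] args) {
        System.out.println("JITC Global_Compile lock: " + missPercent(4572, 5273) + "%");
        System.out.println("Heap lock: " + missPercent(631, 6869) + "%");
    }
}
```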

(21)
(22)

Choosing the Right GC Policy

Four GC “Policies”, optimized for different scenarios

-Xgcpolicy:optthruput optimized for “batch” type applications

-Xgcpolicy:optavgpause optimized for applications with responsiveness criteria

-Xgcpolicy:gencon optimized for highly transactional workloads

-Xgcpolicy:subpool optimized for large systems with allocation contention

How do I know whether to use “optavgpause” or “gencon”?

Monitor GC activity

(23)

Monitoring GC Activity

Use of Verbose GC logging

The only data that is required for GC performance tuning

Graph Verbose GC output using GC and Memory Visualizer (GCMV) from IBM Support Assistant (ISA)

Activated using command line options

-verbose:gc

-Xverbosegclog:[DIR_PATH][FILE_NAME],X,Y

where:

[DIR_PATH] is the directory where the file should be written

[FILE_NAME] is the name of the file to write the logging to

X is the number of files to

Y is the number of GC cycles a file should contain

Performance Cost:

(very) basic testing shows a 2% overhead for GC duration of 200ms
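Verbose GC remains the data source the slides recommend. As a lightweight complement, a running application can also read cumulative GC counts and times through the standard java.lang.management API. The sketch below is an assumption about how one might sample that data; the class name `GcActivitySnapshot` is hypothetical, and the collector names printed vary by JVM and GC policy.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Reads cumulative GC statistics exposed by the running JVM.
final class GcActivitySnapshot {

    // Sums collection counts across all collectors the JVM exposes.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time ms=" + gc.getCollectionTime());
        }
        System.out.println("total collections=" + totalCollections());
    }
}
```

Sampling these counters at intervals gives a rate of collection, but not the per-cycle detail that verbose GC logs contain.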

(24)

Important Characteristics for Choosing GC Policy

Rate of Garbage Collection

High rates of object “burn” point to large numbers of transitional objects, and therefore the application may well benefit from the use of gencon

Large Object Allocations?

The allocation of very large objects adversely affects gencon unless the nursery is sufficiently large. The application may well benefit from optavgpause

Large heap usage variations

The optavgpause algorithms are best suited to consistent allocation profiles

Where large variations occur, gencon may be better suited

(25)

Rate of Garbage Collection

[Figure: garbage collection rate graphs for optavgpause and gencon]

(26)

Rate of Garbage Collection

Gencon provides less frequent long Garbage Collection cycles

Gencon provides a shorter longest Garbage Collection cycle

(27)

Large Object Allocations

(Very) Large Object allocations affect the gencon GC policy

If the object is larger than the Nursery size, the object is immediately tenured

• Removes the benefit of generational heaps

• Still has the additional overhead of running generational

If the object fits in the nursery but fills it, frequent nursery collects will have to occur

• Too frequent nursery collects mean objects are likely to survive and need copying

• Copying is an expensive process

(28)


(29)

Sizing the Java Heap

Maximum possible Java heap sizes

The “correct” Java heap size

Fixed heap sizes vs. Variable heap sizes

Heap Sizing for Generational GC

(30)

Maximum Possible Heap Size

32 bit Java processes have a maximum possible heap size

Varies according to the OS and platform used

Determined by the process memory layout

64 bit processes do not have this limit

Limit exists, but is so large it can be effectively ignored

Addressability usually between 2^44 and 2^64

(31)

The Java process is an Operating System process like any other application:

Subject to OS and architecture restrictions

32bit architecture has an addressable range of:

• 2^32 which is 0x00000000 – 0xFFFFFFFF

• which is 4GB

Not all addressable space is available to the application

The operating system needs memory for:

• The kernel

• The runtime support libraries

Varies according to Operating System

• How much memory is needed and where that memory is located

[Figure: 32-bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000)]

(32)

Memory Available to the Java Process

[Figure: 32-bit address space layouts on Windows® and on AIX®, each from 0 GB (0x0) to 4 GB (0xFFFFFFFF) with marks at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000), showing the regions used by Operating System Space and Libraries]

(33)

Java Process Restrictions

Not all Java Process space is available to the Java application

The Java Runtime needs memory for:

• The Java Virtual Machine

• Backing resources for some Java objects

This memory area, as well as some other allocations, is part of the “Native” Heap

Memory not allocated to the Java Heap is available to the native heap

Available memory space - Java heap = native heap
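The subtraction above can be made concrete with a small sketch. The class name and the sample figures (2048 MB of process space, a 1536 MB Java heap) are illustrative assumptions, not measurements from any platform.

```java
// Illustrative arithmetic: native heap = available process space - Java heap.
final class NativeHeapEstimate {

    static long nativeHeapMb(long availableProcessSpaceMb, long javaHeapMb) {
        if (javaHeapMb > availableProcessSpaceMb) {
            throw new IllegalArgumentException("Java heap exceeds available process space");
        }
        return availableProcessSpaceMb - javaHeapMb;
    }

    public static void main(String[] args) {
        // Hypothetical figures chosen only to show the trade-off.
        System.out.println("native heap MB = " + nativeHeapMb(2048, 1536));
    }
}
```

Raising the Java heap in this model directly shrinks what is left for threads, classes and other native allocations.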

(34)

The “Native” Heap

Allocated using malloc() and therefore subject to memory management by the OS

Used for Virtual Machine resources, e.g.:

• Execution engine

• Class Loader

• Garbage Collector infrastructure

Used to underpin Java objects:

Threads, Classes, AWT objects, ZipFiles

(35)

Native Heap available to Application

[Figure: native heap available to the application on Windows and on AIX (1.4.2 with small heaps). Each 0 GB to 4 GB address space (0x0 to 0xFFFFFFFF) shows the Kernel or Operating System Space, Libraries, VM Resources, the Java Heap and the Native Heap]

(36)

Layout with Large Java Heaps on AIX

Applies to heaps > 1GB in size and Java 5.0

Java heap is allocated using mmap()

Segments used start at 0xC and work downwards

Understanding memory layout is important for monitoring

[Figure: AIX 32-bit address space with a large Java heap, 0 GB (0x0) to 4 GB (0xFFFFFFFF), showing Kernel, Libraries, VM Resources, Native Heap and Java Heap, with segment markers 0x3, 0x7 and 0xD]

(37)

Memory Layout for Linux

[Figure: Linux® 32-bit layout from 0 GB (0x0) to 4 GB (0xFFFFFFFF) showing Java Heap, VM Resources, Native Heap and Kernel, with TASK_SIZE and PAGE_OFFSET marked; z/OS® layout from 0 GB (0x0) to 2 GB (0x7FFFFFFF) showing the Java Heap]

(38)

Theoretical and Advised Max Heap Sizes

The larger the Java heap, the more constrained the native heap

Advised limits to prevent native heap from becoming overly restricted, leading to OutOfMemoryErrors

Exceeding advised limits possible, but should be done only when native heap usage is understood

Native heap usage can be measured using OS tools:

• svmon (AIX), PerfMon (Windows), RMF (z/OS), etc.

Platform   Additional Options   Maximum Possible   Advised Maximum
AIX        automatic            3.25 GB            2.5 GB
Linux      none                 2 GB               1.5 GB
Linux      Hugemem Kernel       3 GB               2.5 GB
Windows    none                 1.8 GB             1.5 GB
Windows    /3GB                 1.8 GB             1.8 GB
z/OS       none                 1.7 GB             1.3 GB

(39)

Moving to 64bit

Moving to 64bit removes the Java heap size limit

However, ability to use more memory is not “free”

64bit applications perform slower

• More data has to be manipulated

• Cache performance is reduced

64bit applications require more memory

• Java Object references are larger

• Internal pointers are larger

(40)

The “correct” Java heap size

GC will adapt heap size to keep occupancy between 40% and 70%

Heap occupancy over 70% causes frequent GC cycles

• Which generally means reduced performance

Heap occupancy below 40% means infrequent GC cycles, but cycles longer than they need to be

• Which means longer pause times than necessary

• Which generally means reduced performance

The maximum heap size setting should therefore be 43% larger than the maximum occupancy of the application

Maximum occupancy + 43% means occupancy at 70% of total heap

• E.g. for 70MB occupancy, 100MB Max heap required, which is 70MB + 43% of 70MB
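The 43% figure follows from dividing peak occupancy by 0.70, since 1 / 0.70 is about 1.43. The sketch below works that through in integer arithmetic; the class and method names are hypothetical, and rounding up to a whole megabyte is a choice made here for illustration.

```java
// Sizes the maximum heap so that peak occupancy sits at 70% of it.
final class HeapSizing {

    static final long TARGET_MAX_OCCUPANCY_PERCENT = 70;

    // Smallest whole-MB heap for which peakOccupancyMb is at most 70% of the heap.
    static long requiredMaxHeapMb(long peakOccupancyMb) {
        long scaled = peakOccupancyMb * 100;
        // Integer ceiling division avoids floating-point rounding surprises.
        return (scaled + TARGET_MAX_OCCUPANCY_PERCENT - 1) / TARGET_MAX_OCCUPANCY_PERCENT;
    }

    public static void main(String[] args) {
        long peak = 70;
        System.out.println("peak occupancy " + peak + "MB -> max heap "
                + requiredMaxHeapMb(peak) + "MB");
    }
}
```

For a measured peak of 1000MB the same rule gives 1429MB, which could then be supplied as the -Xmx value.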

(41)

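The “correct” Java heap size

[Figure: memory graph of Heap Occupancy within the Heap Size, with 40% and 70% occupancy lines; the region above 70% is labelled Too Frequent Garbage Collection and the region below 40% is labelled Long Garbage Collection Cycles]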

(42)

Fixed heap sizes vs. Variable heap sizes

Should the heap size be “fixed”?

i.e. Minimum heap size (-Xms) = Maximum heap size (-Xmx)?

Each option has advantages and disadvantages

As for most performance tuning, you must select which is right for the particular application

Variable Heap Sizes

GC will adapt heap size to keep occupancy between 40% and 70%

• Expands and Shrinks the Java heap

Allows for scenario where usage varies over time

• Where variations would take usage outside of the 40-70% window

Fixed Heap Sizes

(43)

Heap Expansion and Shrinkage

Act of heap expansion and shrinkage is relatively “cheap”

However, a compaction of the Java heap is sometimes required

Expansion: for some expansions, GC may have already compacted to try to allocate the object before expansion

Shrinkage: GC may need to compact to move objects from the area of the heap being “shrunk”

Whilst expansion and shrinkage optimize heap occupancy, they (usually) do so at the cost of compaction cycles

(44)

Conditions for Heap Expansion

Not enough free space available for object allocation after GC has completed

• Occurs after a compaction cycle

• Typically occurs where there is fragmentation or during rapid occupancy growth (e.g., application startup)

Heap occupancy is over 70%

• Compaction unlikely

More than 13% of time is spent in GC

• Compaction unlikely

(45)

Conditions for Heap Shrinkage

Heap occupancy is under 40%

And the following is not true:

• Heap has been recently expanded (last 3 cycles)

• GC is a result of a System.gc() call

Compaction occurs if:

• An object exists in the area being shrunk

• GC did not shrink on the previous cycle

Compaction is therefore likely to occur
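The expansion and shrinkage conditions listed on these two slides can be summarised as a simplified decision model. This is a sketch of the slide text only, not the JVM's actual logic: the class, enum and parameter names are hypothetical, and treating "last 3 cycles" as `cyclesSinceExpansion <= 3` is an interpretation.

```java
// Simplified model of the resize conditions described in the slides.
final class HeapResizeModel {

    enum Action { EXPAND, SHRINK, NONE }

    static Action decide(double occupancy,             // fraction of heap in use after GC
                         double gcTimeRatio,           // fraction of time spent in GC
                         boolean allocationStillFails, // no room for the object after GC
                         int cyclesSinceExpansion,     // use Integer.MAX_VALUE if never expanded
                         boolean systemGcRequested) {  // cycle caused by System.gc()
        if (allocationStillFails || occupancy > 0.70 || gcTimeRatio > 0.13) {
            return Action.EXPAND;
        }
        boolean recentlyExpanded = cyclesSinceExpansion <= 3;
        if (occupancy < 0.40 && !recentlyExpanded && !systemGcRequested) {
            return Action.SHRINK;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(0.80, 0.05, false, 10, false));
        System.out.println(decide(0.30, 0.05, false, 10, false));
        System.out.println(decide(0.30, 0.05, false, 2, false));
    }
}
```

The model leaves out compaction entirely; it only shows which conditions push the heap size in which direction.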

(46)

Introduction to -Xmaxf and -Xminf

The -Xmaxf and -Xminf settings control the 40% and 70% occupancy bounds

-Xmaxf: the maximum heap space free before shrinkage (default is 0.6 for 40% occupancy)

-Xminf: the minimum heap space free before expansion (default is 0.3 for 70% occupancy)

Can be used to “move” optimum occupancy window if required by the application

e.g. Lower heap utilization required for more infrequent GC cycles

Can be used to prevent shrinkage

-Xmaxf1.0 would mean shrinkage only when heap is 100% free

Would completely remove shrinkage capability
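Because both settings are expressed as free-space fractions, the occupancy bounds are their complements. The sketch below shows that conversion; the class and method names are hypothetical and it performs no validation of the flag values.

```java
// Converts the free-space fractions of -Xmaxf / -Xminf into occupancy bounds.
final class OccupancyWindow {

    // -Xmaxf is the maximum free fraction, so it sets the lower occupancy bound.
    static double lowerOccupancy(double maxf) {
        return 1.0 - maxf;
    }

    // -Xminf is the minimum free fraction, so it sets the upper occupancy bound.
    static double upperOccupancy(double minf) {
        return 1.0 - minf;
    }

    public static void main(String[] args) {
        double maxf = 0.6;   // default quoted in the slides
        double minf = 0.3;   // default quoted in the slides
        System.out.println("occupancy window: "
                + Math.round(100 * lowerOccupancy(maxf)) + "% to "
                + Math.round(100 * upperOccupancy(minf)) + "%");
    }
}
```

With a maximum free fraction of 1.0 the lower bound becomes 0% occupancy, which matches the slide's point that shrinkage would then never be triggered.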

(47)

Introduction to -Xmaxe and -Xmine

The -Xmaxe and -Xmine settings control the bounds of the size of each expansion step

-Xmaxe: the maximum amount of memory to add to the heap size in the case of expansion (default is unlimited)

-Xmine: the minimum amount of memory to add to the heap size in the case of expansion (default is 1MB)

Can be used to reduce/prevent compaction due to expansion

Reduce expansions by setting a large -Xmine

(48)

GC Managed Heap Sizing

[Figure: memory against time, showing Heap Occupancy and Heap Size, with the -Xminf and -Xmaxf bounds and expansion steps (>= -Xmine); regions labelled Long Garbage Collection Cycles and Too Frequent Garbage Collection]

(49)

Fixed or Variable?

Again, dependent on application

For “flat” memory usage, use fixed

For widely varying memory usage, consider variable

Variable provides more flexibility and ability to avoid OutOfMemoryErrors

Some of the disadvantages can be avoided:

• -Xms set to lowest steady state memory usage prevents expansion at startup

• -Xmaxf1 will remove shrinkage

• -Xminf can be used to prevent compaction before expansion

(50)

Nursery Tenured

Options Are:

Fix both nursery and tenured space

Allow them to expand/contract

General Advice:

Fix the new space size

Size the tenured space as you would for a “flat” heap

(51)

Sizing the Nursery

“Copying” from Allocate to Survivor or to Tenured space is expensive

Physical data is copied (similar to compaction, which is also expensive)

Ideally “survival” rates should be as low as possible

• Less data needs to be copied

• Fewer tenured/global collects will occur

The larger the nursery:

• the greater the time between collects

• the fewer objects that should survive

However, the longer a copy can potentially take

(52)

Summary

GC Policy should be chosen according to application scenario

Java heap should ideally be sized for between 40 and 70% occupancy

(53)

Discover the latest trends in WebSphere Technology and implementation, participate in technically-focused briefings, webcasts and podcasts at:

http://www.ibm.com/developerworks/websphere/community/

Learn about other upcoming webcasts, conferences and events:

http://www.ibm.com/software/websphere/events_1.html

Join the Global WebSphere User Group Community: http://www.websphere.org

Access key product show-me demos and tutorials by visiting IBM Education Assistant:

http://www.ibm.com/software/info/education/assistant

View a Flash replay with step-by-step instructions for using the Electronic Service Request (ESR) tool for submitting problems electronically:

http://www.ibm.com/software/websphere/support/d2w.html

Sign up to receive weekly technical My support emails:

(54)

Additional Java Product Resources

Obtain Java Documentation:

https://www.ibm.com/developerworks/java/jdk/docs.html

Download the IBM Java SDKs:

https://www.ibm.com/developerworks/java/jdk/index.html

Find and download Java tooling:

http://www.ibm.com/software/websphere/events_1.html

Troubleshoot Java with the IBM Guided Activity Assistant:

http://www-01.ibm.com/support/docview.wss?uid=swg27010135

Troubleshoot Java with the Guided Troubleshooting InfoCenter

http://publib.boulder.ibm.com/infocenter/javasdk/tools/topic/com.ibm.java.doc.tools.welcome/tools/welcome/welcome.html

Discuss IBM Java:

(55)
