Perl Memory Use. Tim OSCON July 2012


(1)

Perl Memory Use

(2)

Scope of the talk...

Not really "profiling"

No leak detection

No VM, page mapping, MMU, TLB, threads etc

Linux focus

Almost no copy-on-write

(3)

Goals

Give you a top-to-bottom overview

Identify the key issues and complications

Show you useful tools along the way

(4)

Ouch!

$ perl some_script.pl

Out of memory!

$

$ perl some_script.pl

Killed.

$

$ perl some_script.pl

$

Someone shouts: "Hey! My process has been killed!"

$ perl some_script.pl

(5)
(6)

C Program Code: int main(...) { ... }

Read-only Data: e.g. "String constants"

Read-write Data: un/initialized variables

Heap

(not to scale!)

Shared Lib Code, Shared Lib R/O Data, Shared Lib R/W Data (repeated for each lib)

C Stack (not the perl stack)

(7)

$ perl -e 'system("cat /proc/$$/stat")'

4752 (perl) S 4686 4752 4686 34816 4752 4202496 536 0 0 0 0 0 0 0 20 0 1 0 62673440 123121664 440 18446744073709551615 4194304 4198212 140735314078128 140735314077056 140645336670206 0 0 134 0 18446744071579305831 0 0 17 10 0 0 0 0 0 0 0 0 0 0 4752 111 111 111

$ perl -e 'system("cat /proc/$$/statm")'

30059 441 346 1 0 160 0

$ perl -e 'system("ps -p $$ -o vsz,rsz,sz,size")'

   VSZ   RSZ    SZ    SZ
120236  1764 30059   640
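The statm fields count pages, not kilobytes, which is why they look so different from the ps columns. A minimal sketch of the conversion, assuming the field names from proc(5) and a 4 kB page for the sample line; statm_to_kb is a hypothetical helper, not part of any module mentioned in this talk:

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_PAGESIZE);

# Field names follow proc(5); statm values are counts of pages, not bytes.
my @STATM_FIELDS = qw(size resident shared text lib data dt);

# Convert one line of /proc/<pid>/statm into a hash of kB values.
# statm_to_kb is a hypothetical helper name used for illustration only.
sub statm_to_kb {
    my ($line, $page_bytes) = @_;
    my @pages = split ' ', $line;
    my %kb;
    @kb{@STATM_FIELDS} = map { $_ * $page_bytes / 1024 } @pages;
    return \%kb;
}

# The sample line from the slide, assuming 4 kB pages.
my $sample = statm_to_kb('30059 441 346 1 0 160 0', 4096);
printf "size=%d kB resident=%d kB\n", @$sample{qw(size resident)};

# The live process, when /proc is available (Linux only).
if (open my $fh, '<', "/proc/$$/statm") {
    my $live = statm_to_kb(scalar <$fh>, sysconf(_SC_PAGESIZE));
    printf "this process: size=%d kB resident=%d kB\n",
        @$live{qw(size resident)};
}
```

For the sample line this gives 30059 x 4 = 120236 kB and 441 x 4 = 1764 kB, matching the VSZ and RSZ columns that ps reports.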

$ perl -e 'system("top -b -n1 -p $$")'

...

  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
13063 tim  20  0 117m 1764 1384 S  0.0  0.1 0:00.00 perl

$ perl -e 'system("cat /proc/$$/status")'

...

VmPeak:   120236 kB
VmSize:   120236 kB   <- total (code, libs, stack, heap etc.)
VmHWM:      1760 kB
VmRSS:      1760 kB   <- how much of the total is resident in physical memory
VmData:      548 kB   <- data (heap)
VmStk:        92 kB   <- stack
VmExe:         4 kB   <- code
VmLib:      4220 kB   <- libs, including libperl.so
VmPTE:        84 kB
VmPTD:        28 kB
VmSwap:        0 kB
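The same Vm* values can be read from inside a running script. A minimal sketch, assuming a Linux /proc; parse_vm_status and self_vm_status are hypothetical helper names:

```perl
use strict;
use warnings;

# Parse the "Vm*" lines of /proc/<pid>/status into { name => kB }.
# Both helper names here are hypothetical, chosen for this sketch.
sub parse_vm_status {
    my ($text) = @_;
    my %vm;
    while ($text =~ /^(Vm\w+):\s+(\d+)\s+kB/mg) {
        $vm{$1} = $2;
    }
    return \%vm;
}

# Read the status file of the current process; empty hash without /proc.
sub self_vm_status {
    open my $fh, '<', '/proc/self/status' or return +{};
    local $/;
    return parse_vm_status(scalar <$fh>);
}

my $vm = self_vm_status();
printf "%-8s %8d kB\n", $_, $vm->{$_} for sort keys %$vm;
```

Calling self_vm_status() before and after a piece of work and comparing VmSize or VmData is usually enough for a first look.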

(8)

$ perl -e 'system("cat /proc/$$/maps")'

address perms ... pathname

00400000-00401000 r-xp ... /.../perl-5.NN.N/bin/perl
00601000-00602000 rw-p ... /.../perl-5.NN.N/bin/perl
0087f000-008c1000 rw-p ... [heap]
7f858cba1000-7f8592a32000 r--p ... /usr/lib/locale/locale-archive-rpm
7f8592c94000-7f8592e1a000 r-xp ... /lib64/libc-2.12.so
7f8592e1a000-7f859301a000 ---p ... /lib64/libc-2.12.so
7f859301a000-7f859301e000 r--p ... /lib64/libc-2.12.so
7f859301e000-7f859301f000 rw-p ... /lib64/libc-2.12.so
7f859301f000-7f8593024000 rw-p ...
...other libs...
7f8593d1b000-7f8593e7c000 r-xp ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f8593e7c000-7f859407c000 ---p ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f859407c000-7f8594085000 rw-p ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f85942a6000-7f85942a7000 rw-p ...
7fff61284000-7fff6129a000 rw-p ... [stack]
7fff613fe000-7fff61400000 r-xp ... [vdso]

(9)

$ perl -e 'system("cat /proc/$$/smaps")' # note ‘smaps’ not ‘maps’

address perms ... pathname ...

7fb00fbc1000-7fb00fd22000 r-xp ... /.../5.10.1/x86_64-linux/CORE/libperl.so

Size:            1412 kB   <- size of executable code in libperl.so
Rss:              720 kB   <- amount that's in physical memory
Pss:              364 kB
Shared_Clean:     712 kB
Shared_Dirty:       0 kB
Private_Clean:      8 kB
Private_Dirty:      0 kB
Referenced:       720 kB
Anonymous:          0 kB
AnonHugePages:      0 kB
Swap:               0 kB
KernelPageSize:     4 kB
MMUPageSize:        4 kB

... repeated detail for every mapped item ...
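Because smaps repeats the same fields for every mapping, per-process totals have to be summed. A minimal sketch, assuming the field names shown above; sum_smaps is a hypothetical helper (Linux::Smaps offers a fuller interface):

```perl
use strict;
use warnings;

# Sum selected per-mapping fields (in kB) across an smaps listing.
# sum_smaps is a hypothetical helper name used for illustration only.
sub sum_smaps {
    my ($text, @fields) = @_;
    my %total = map { $_ => 0 } @fields;
    my $want  = join '|', map { quotemeta($_) } @fields;
    # Anchoring at line start keeps "Size" from matching "KernelPageSize".
    while ($text =~ /^($want):\s+(\d+)\s+kB/mg) {
        $total{$1} += $2;
    }
    return \%total;
}

if (open my $fh, '<', '/proc/self/smaps') {
    my $text = do { local $/; <$fh> };
    my $sum  = sum_smaps($text, qw(Size Rss Pss Private_Dirty Swap));
    printf "%-14s %8d kB\n", $_, $sum->{$_} for sort keys %$sum;
}
```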

Process view: everything exists in sequential contiguous physical memory. Simple.

System view: chunks of physical memory are mapped into place and loaded on demand, then taken away again when the process isn't looking.

(10)

To the program everything appears to be in physical memory.

In reality that's rarely the case.

Memory is divided into pages.

Page size is typically 4KB.

[Diagram: each region of the process (C Program Code, Read-only Data, Read-write Data, Heap, Shared Lib Code, Shared Lib R/O Data, Shared Lib R/W Data, C Stack, System) divided into pages, some marked as 'resident' in physical memory and some as not resident.]

Pages:

• are loaded when first used

• may be 'paged out' when the system needs the physical memory

• may be shared with other processes

• may be copy-on-write, where a shared page becomes private when first written to

(11)

Key Points

Pages of a process can be paged out if the system wants the physical memory. So Resident Set Size (RSS) can shrink even while the overall process size grows.

Re private/shared/copy-on-write: If a page is currently paged out its attributes are paged out as well. In this case a page is neither reported as private nor as shared. It is only included in the process size.

So be careful to understand what you're actually measuring!

Generally total memory size is a good indicator.

(12)

Low-Level Modules

BSD::Resource - getrusage() system call (limited on Linux)

BSD::Process - Only works on BSD, not Linux

Proc::ProcessTable - Interesting but buggy

Linux::Smaps - very detailed, but only works on Linux

(13)

Higher-Level Modules

Memory::Usage

Reads /proc/$pid/statm. Reports changes on demand.

Dash::Leak

Uses BSD::Process. Reports changes on demand.

Devel::MemoryTrace::Light

Uses GTop or BSD::Process. Automatically prints a message when memory use grows, pointing to a particular line number.

(14)

Other Modules

✦ Devel::Plumber - memory leak finder for C programs

✦ Uses GDB to walk internal glibc heap structures. Can work on either a live process or a core file. Treats the C heap of the program under test as a collection of non-overlapping blocks, and classifies them into one of four states.

✦ Devel::Memalyzer - Base framework for analyzing program memory usage

✦ Runs and monitors a subprocess via plugins that read /proc smaps and status at regular intervals.

✦ Memchmark - Check memory consumption

✦ Memchmark forks a new process to run the sub and then monitors its memory

(15)

A Peak

(16)

Heap

Your data goes here

Perl uses malloc() and free() to manage the space

malloc has its own issues (overheads, bucket sizes, fragmentation etc. etc.)

Perl uses its own malloc code on some systems

On top of malloc perl has its own layer of memory management (e.g. arenas) for some data types

(17)
(18)

Data Anatomy Examples

Integer (IV)

String (PV)

Number with a string

(19)

Array (AV)

Hash (HV)

(20)

Glob (GV) Symbol Table (Stash)

(21)

Notes

All Heads and Bodies are allocated from arenas managed by perl

efficient, low overhead and no fragmentation

but arena space for a given data type is never freed or repurposed

All variable length data storage comes from malloc

higher overheads, bucket and fragmentation issues

Summing the “apparent size” of a data structure will underestimate

(22)

Arenas

$ perl -MDevel::Gladiator=arena_table -e 'warn arena_table()'

ARENA COUNTS:
 1063 SCALAR
  199 GLOB
  120 ARRAY
   95 CODE
   66 HASH
    8 REGEXP
    5 REF
    4 IO::File
    3 REF-ARRAY
    2 FORMAT
    1 version
    1 REF-HASH
    1 REF-version

arena_table() formats the hash returned by arena_ref_counts().

(23)

Devel::Peek

Gives you a textual view of the data structures

$ perl -MDevel::Peek -e '%a = (42 => "Hello World!"); Dump(\%a)'

SV = IV(0x1332fd0) at 0x1332fe0
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x1346730
  SV = PVHV(0x1339090) at 0x1346730
    REFCNT = 2
    FLAGS = (SHAREKEYS)
    ARRAY = 0x1378750  (0:7, 1:1)
    hash quality = 100.0%
    KEYS = 1
    FILL = 1
    MAX = 7
    RITER = -1
    EITER = 0x0
    Elt "42" HASH = 0x73caace8
    SV = PV(0x1331090) at 0x1332de8
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x133f960 "Hello World!"\0
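Devel::Peek also makes the "number with a string" case visible. A minimal sketch, assuming only the core Devel::Peek and File::Temp modules; dump_text is a hypothetical helper that captures the Dump() output, which normally goes to STDERR:

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);
use File::Temp qw(tempfile);

# Devel::Peek writes to STDERR, so capture it by pointing STDERR at a
# temporary file for the duration of one Dump() call.
# dump_text is a hypothetical helper name used for illustration only.
sub dump_text {
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;
    open my $saved, '>&', \*STDERR or die "dup STDERR: $!";
    open STDERR, '>', $path        or die "redirect STDERR: $!";
    Dump($_[0]);                   # $_[0] aliases the caller's variable
    open STDERR, '>&', $saved      or die "restore STDERR: $!";
    open my $in, '<', $path        or die "read $path: $!";
    local $/;
    return scalar <$in>;
}

my $n = 42;
my $before = dump_text($n);        # a plain integer
my $as_string = "$n";              # using $n as a string caches the text in $n
my $after = dump_text($n);         # now an integer that also carries a string

print "before:\n$before\nafter:\n$after";
```

The first line of each dump shows the body type changing once the integer has been used as a string. The exact flags differ between perl versions, so none are listed here.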

(24)

Devel::Size

Gives you a measure of the size of a data structure

$ perl -MDevel::Size=total_size -Minteger -le 'print total_size( 0 )'
24

$ perl -MDevel::Size=total_size -Minteger -le 'print total_size( [] )'
64

$ perl -MDevel::Size=total_size -Minteger -le 'print total_size( {} )'
120

$ perl -MDevel::Size=total_size -le 'print total_size( [ 1..100 ] )'
3264

Makes somewhat arbitrary decisions about what to include for non-data types

Doesn't or can't accurately measure subs, forms, regexes, and IOs.

Can't measure 'everything' (total_size(\%main::) is the best we can do)
(25)

Space in Hiding

Perl tends to use memory to save time

This can lead to surprises, for example:

sub foo { my $var = "#" x 2**20; }

foo();   # ~1MB still used after return

sub bar {
    my $var = "#" x 2**20;
    bar($_[0]-1) if $_[0];   # recurse
}

(26)

Devel::Size 0.77

perl -MDevel::Size=total_size -we '
    sub foo { my $var = "#" x 2**20; foo($_[0]-1) if $_[0]; 1 }
    system("grep VmData /proc/$$/status");
    printf "%d kB\n", total_size(\&foo)/1024;
    foo(50);
    system("grep VmData /proc/$$/status");
    printf "%d kB\n", total_size(\&foo)/1024;
'

VmData:      796 kB
7 kB
VmData:   105652 kB
8 kB

VmData grew by ~100MB but we expected ~50MB. Not sure why.
(27)

Devel::Size 0.77

+ hacks

perl -MDevel::Size=total_size -we '
    sub foo { my $var = "#" x 2**20; foo($_[0]-1) if $_[0]; 1 }
    system("grep VmData /proc/$$/status");
    printf "%d kB\n", total_size(\&foo)/1024;
    foo(50);
    system("grep VmData /proc/$$/status");
    printf "%d kB\n", total_size(\&foo)/1024;
'

VmData:      796 kB
293 kB
VmData:   105656 kB
104759 kB

Now does include the pad variables.
(28)

Devel::Size 0.77

+ hacks

$ report='printf "total_size %6d kB\n", total_size(\%main::)/1024; system("grep VmData /proc/$$/status")'

$ perl -MDevel::Size=total_size -we "$report"
total_size    290 kB
VmData:       800 kB

$ perl -MMoose -MDevel::Size=total_size -we "$report"
total_size   9474 kB    [  9474-290 = + 9184 kB ]
VmData:     11824 kB    [ 11824-800 = +11024 kB ]

What accounts for the 1840 kB difference in the increases?

- Arenas and other perl-internals aren't included

- Limitations of Devel::Size measuring subs and regexes

(29)

Malloc and

The Heap

(30)

“Malloc and

The Heap”

(31)

Heap

Requests big chunks of memory from the operating system as needed.

Almost never returns it!

Perl makes lots of alloc and free requests.

Freed fragments of various sizes accumulate.

[Diagram: perl data in the heap]

(32)

$ man malloc

"When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3)."

That's for RHEL/CentOS 6. Your mileage may vary.

Space vs speed trade-off: mmap() and munmap() probably slower.

Other malloc implementations can be used via LD_PRELOAD env var, e.g.

export LD_PRELOAD="/usr/lib/libtcmalloc.so"
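One way to see whether a large allocation is handed back to the system is to watch VmSize around it. A rough sketch, assuming a Linux /proc; vm_size_kb and big_alloc_trace are hypothetical helpers, and whether the "after" figure drops depends on the malloc in use:

```perl
use strict;
use warnings;

# Current virtual size of this process in kB, or undef without /proc.
# Both helper names in this sketch are hypothetical.
sub vm_size_kb {
    open my $fh, '<', '/proc/self/status' or return undef;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmSize:\s+(\d+)\s+kB/;
    }
    return undef;
}

# Grow a scalar's own buffer to $bytes, then release it with undef,
# sampling VmSize at each step.
sub big_alloc_trace {
    my ($bytes) = @_;
    my $before = vm_size_kb();
    my $big = '';
    vec($big, $bytes - 1, 8) = 0;   # extends $big to $bytes bytes
    my $during = vm_size_kb();
    undef $big;                     # frees the string buffer
    my $after = vm_size_kb();
    return { before => $before, during => $during, after => $after };
}

my $trace = big_alloc_trace(64 * 2**20);   # 64 MB, well above 128 kB
if (defined $trace->{before}) {
    printf "%-6s %8d kB\n", $_, $trace->{$_} for qw(before during after);
}
```

With a malloc that uses mmap() for large blocks, "after" would be expected to fall back close to "before". With small allocations that add up to the same total it typically would not.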
(33)

PERL_DEBUG_MSTATS*

* Requires a perl configured to use its own malloc (-Dusemymalloc)

$ PERL_DEBUG_MSTATS=1 perl -MMoose -MDevel::Size=total_size -we "$report"
total_size   9474 kB    [  9474-290 = + 9184 kB ]
VmData:     11824 kB    [ 11824-800 = +11024 kB ]
Memory allocation statistics after execution:   (buckets 8(8)..69624(65536)
  429248 free:   225   125    69    25    18     1     3     6     0     6     1    23     0     0
                   0     9    26    10
 6302120 used:   795 14226  2955  3230  2190  1759   425   112    30   862    11     2     1     2
                   0  1606  8920  4865
Total sbrk(): 6803456/1487:-13. Odd ends: pad+heads+chain+tail: 2048+70040+0+0

There's 419 kB ("429248 free") sitting in unused malloc buckets.

See perldebguts and Devel::Peek docs for details. Also Devel::Mallinfo.
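The 419 kB figure is just the "free" byte total divided by 1024. A small sketch of pulling both totals out of a report like the one above; mstats_totals_kb is a hypothetical helper, and the report text is the slide's sample rather than live output:

```perl
use strict;
use warnings;

# Pull the "<bytes> free:" and "<bytes> used:" totals out of a
# PERL_DEBUG_MSTATS report and convert them to whole kB.
# mstats_totals_kb is a hypothetical helper name.
sub mstats_totals_kb {
    my ($report) = @_;
    my %kb;
    while ($report =~ /^\s*(\d+)\s+(free|used):/mg) {
        $kb{$2} = int($1 / 1024);
    }
    return \%kb;
}

# Sample report copied from the slide.
my $report = <<'END';
Memory allocation statistics after execution:   (buckets 8(8)..69624(65536)
  429248 free:   225   125    69    25    18     1     3     6     0     6     1    23     0     0
                   0     9    26    10
 6302120 used:   795 14226  2955  3230  2190  1759   425   112    30   862    11     2     1     2
                   0  1606  8920  4865
Total sbrk(): 6803456/1487:-13. Odd ends: pad+heads+chain+tail: 2048+70040+0+0
END

my $kb = mstats_totals_kb($report);
printf "free: %d kB, used: %d kB\n", @$kb{qw(free used)};
# prints "free: 419 kB, used: 6154 kB"
```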
(34)

Key Notes

Perl uses malloc to manage heap memory

Malloc uses sized buckets and free lists etc.

Malloc has overheads

Freed chunks of various sizes accumulate

Large allocations may use mmap()/munmap()

Your malloc may be tunable

(35)
(36)

What does that mean?

Track memory size over time?

"Memory went up 53 kB while in sub foo"

Has to be done by internals not proc size

Experimental NYTProf patch by Nicholas

Measured memory instead of CPU time

Turned out to not seem very useful

(37)
(38)
(39)

The Draft Plan

Add a function to Devel::Size to return the size of everything.

including arenas and malloc overheads (where knowable)

try to get as close to VmData value as possible

Add a C-level callback hook

Add some kind of "data path name" chain for the callback to use

Add multi-phase scan

1: start via symbol tables, note & skip where ref count > 1

2: process all the skipped items (ref chains into unnamed data)

3: scan arenas for leaked values (not seen in scan 1 or 2)

Write all the name=>size data to disk
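The "start via symbol tables" step of phase 1 can be pictured in pure Perl. This is only an illustration of the traversal order, not the planned C-level code in Devel::Size: it measures nothing and ignores reference counts, and walk_stash is a hypothetical name:

```perl
use strict;
use warnings;

# Visit every symbol-table entry reachable from a package, depth first.
# $visit is called as $visit->($package, $symbol_name).
# walk_stash is a hypothetical helper; it does no sizing at all.
sub walk_stash {
    my ($pkg, $visit, $seen) = @_;
    $seen ||= {};
    return if $seen->{$pkg}++;                # %main:: contains itself
    no strict 'refs';
    my $stash = \%{"${pkg}::"};
    for my $name (sort keys %$stash) {
        if ($name =~ /\A(.+)::\z/s) {         # nested package stash
            my $child = $pkg eq 'main' ? $1 : "${pkg}::$1";
            walk_stash($child, $visit, $seen);
        }
        else {
            $visit->($pkg, $name);
        }
    }
    return;
}

our $example = 1;                # a named variable for the walk to find
my %symbols_in;
walk_stash('main', sub { $symbols_in{ $_[0] }++ });
printf "%-30s %5d symbols\n", $_, $symbols_in{$_}
    for sort keys %symbols_in;
```

Each visited name would be the start of a "data path name" chain in the plan above. Anything reachable only through references is left for the later phases.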

(40)

Questions?

[email protected]

http://blog.timbunce.org

Further info on unix.stackexchange.com

GTop - Perl interface to libgtop, better but external dependency

Illustrations from illguts
