
© 2011 IBM Corporation

99. z/OS Guide, Lahnstein, 16 March 2011

z/OS Unix System Services Dumps - Dump Debugging for Dummies

Matthias Korn

z/OS Virtual Frontend / Unix System Services EMEA Level 2

IBM Deutschland GmbH


z/OS Unix System Services Dump Debugging | 15 Mar 2011

2 © Copyright IBM Corporation 2011

What are we talking about today?

The two categories of dumps

How to capture an unformatted dump

IPCS – powerful tool to read unformatted dumps

IPCS – First steps to navigate

IPCS – Next steps to navigate

IPCS – Useful general commands to gather information

BPXI070E at shut down – Finding the root cause using a SLIP dump

Hiper Apar OA34226 – What does a dump show in this case?

OMVS Debug HTML Update


The two categories of dumps

There are two categories of dumps:

Formatted dumps

SYSABEND, SYSUDUMP, SNAP dumps

Unformatted dumps

SVC dumps, SYSMDUMP abend dumps, stand-alone dumps


How to capture an unformatted dump

System abends – e.g. ABENDEC6, ABEND0C4, ABEND878

dump captured by recovery routines

SLIP – e.g. a reason code SLIP trap under USS

SLIP processing gets control when the defined conditions match

SLIP schedules an SVC dump and captures trace records

Dynamic dump – e.g. a console dump

dump captured via the DUMP command

no trigger necessary


How to capture an unformatted dump (cont.)

SADUMP program

standalone dump program loaded as part of a restart

SADUMP captured in hang / loop situations

SYSMDUMP DD card

dump captured in connection with LE runtime options such as TER(UADUMP), ABT(ABEND), TRAP(ON)


IPCS – powerful tool to read unformatted dumps

a problem-state, key 8 program running in the TSO/E user's address space

operates in interactive and batch environments

IPCS is based on a TSO/E command processor

the TSO/E 'IPCS' command activates the IPCS command processor

all commands that perform IPCS functions are subcommands of the IPCS command

for interactive use, IPCS uses ISPF dialog support to run as a full-screen application


IPCS (cont.)

helps you to …

format and read component traces, GTF traces

format and analyze unformatted dumps

format and display control blocks

is able to identify …

jobs with error return codes

resource contentions


IPCS – First steps to navigate

What kind of dump do we have?

What was the dump written for?


IPCS – Next steps to navigate

Which address spaces have been dumped?

What are the corresponding jobnames?


IPCS CBF RTCT (IP CBF RTCT)


IPCS LIST E0. LENGTH(16) BLOCK(0)

Lists the SDRSN (SDUMP partial dump reason code) control block.

If all requested bytes are X'00', the dump is complete. Otherwise, review the SDRSN control block description in z/OS MVS Data Areas Volume 5 (MCSCSA – SNAPX) to find the actual reason.
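
The all-zero check on the 16 bytes can be done by eye, but as a minimal sketch it might look like the following. The function name and both sample byte strings are hypothetical; only the rule "all bytes zero means the dump is complete" comes from the slide.

```python
def sdrsn_is_clean(hex_bytes: str) -> bool:
    """Return True if every byte in the given hex string is zero."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    # An empty string is treated as "unknown", not as a complete dump.
    return len(raw) > 0 and not any(raw)


# Hypothetical 16-byte values as they might be copied from the LIST output.
complete = "00000000 00000000 00000000 00000000"
partial = "00000000 00800000 00000000 00000000"

print(sdrsn_is_clean(complete))
print(sdrsn_is_clean(partial))
```

A non-zero result only says that the dump is partial; the meaning of the individual bits still has to be looked up in the SDRSN mapping.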


IPCS – Useful general commands

Which trace data are available?

Does any resource contention exist?

How much real storage is available / in use?

Which events (abends) have been logged?

What can be determined about OMVS?


IPCS VERBX MTRACE

The MTRACE verb exit displays the master trace table which corresponds to the syslog

of your image. The status of it can be determined via 'D TRACE' and changed via


IPCS SYSTRACE ASID(1) TIME(LOCAL)

The SYSTRACE IPCS subcommand displays the system trace table and formats system trace entries for each address space. Its status can be determined via 'D TRACE' and changed via the 'TRACE ST' operator command.


IPCS ANALYZE RESOURCE


IPCS RSMDATA SUMMARY


IPCS VERBX LOGDATA

Shows the in-storage logrec buffers. It invokes the EREP program to format the logrec records.


IPCS OMVSDATA

Formats OMVS-relevant information about processes, threads, files and file systems managed by OMVS and serviced by HFS, ZFS, NFS and TFS.

The dump needs to contain the OMVS address space and the OMVS data spaces.

Options:

IP OMVSDATA

IP OMVSDATA PROCESS

IP OMVSDATA FILE

IP OMVSDATA STORAGE

IP OMVSDATA IPC

IP OMVSDATA COMMUNICATION

Report Types:

SUMMARY

DETAIL

EXCEPTION


IPCS OMVSDATA PROCESS

Displays a Unix System Services process summary report including PID, associated user ID, ASID, parent process ID and status (e.g. zombie).


IPCS OMVSDATA PROCESS DETAIL

Displays a detailed report about each process dubbed to Unix System Services including its different threads (TCBs), active system calls, open file descriptors and sent / received sysplex work.


IPCS OMVSDATA FILE

Displays a report of all mounted file systems known to the system on which the dump was taken, including file system name, mount point, latch number and the token to the internal control blocks representing the file system.


IPCS OMVSDATA FILE DETAIL

Displays a report of all active files in the system. An active file is either open or has recently been referenced. The 'File Serial Number' and the 'Device Number' uniquely identify a file (directory, regular file, character special, FIFO, symbolic link).


IPCS OMVSDATA STORAGE

Displays a report of all active cell pools in use by z/OS Unix. The report contains information about common storage and data space resident cell pools as well as private storage resident cell pools.


IPCS CTRACE COMP(SYSOMVS) FULL LOCAL

Formats the OMVS component trace. The trace data reside in the SYSZBPX1 data space, so the OMVS data spaces must always be included in a dump.

The trace is always active at least in MINIMUM mode; for OMVS-related problems it is always recommended to activate the trace. For details see the USS Diagnosis HTML file.


IPCS CTRACE QUERY(SYSOMVS) FULL LOCAL


BPXI070E at shut down – using a slip dump

Symptoms:

*BPXI066E OMVS SHUTDOWN COULD NOT MOVE OR UNMOUNT ALL FILE SYSTEMS
BPXM054I FILE SYSTEM OMVS.ETC.MSYX FAILED TO UNMOUNT.
         RET CODE = 00000072, RSN CODE = 058800AA
BPXM054I FILE SYSTEM SYS1.ROOT.MSYX.OMVSSIDA FAILED TO UNMOUNT.
         RET CODE = 00000072, RSN CODE = 058800AA
*195 BPXI070E USE SETOMVS ON ANOTHER SYSTEM TO MOVE NEEDED
     FILE SYSTEMS, THEN REPLY WITH ANY KEY TO CONTINUE SHUTDOWN


BPXI070E at shut down (cont.)

TSO BPXMTEXT 058800AA

BPXFSUMT 03/05/08
JRFsParentFs: The file system has file systems mounted on it.
Action: An unmount request can be honored only if there are no
file systems mounted anywhere on the requested file system.
Use the F BPXOINIT,FILESYS=DISPLAY,ALL command for a shared file
system configuration or the D OMVS,FILE command for a non-shared
file system configuration to determine which file systems are
mounted on the requested file system.
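
Before looking up a reason code it can help to take it apart. The sketch below splits 058800AA into its two halfwords and converts the return code to decimal. It assumes the usual z/OS UNIX convention that the upper halfword qualifies the lower halfword, which carries the actual reason (here the one BPXMTEXT reports as JRFsParentFs). The function name is hypothetical.

```python
def split_reason_code(rsn_hex: str) -> tuple:
    """Split a 4-byte reason code into (upper halfword, lower halfword)."""
    value = int(rsn_hex, 16)
    return value >> 16, value & 0xFFFF


# Values taken from the BPXM054I messages of this case.
high, low = split_reason_code("058800AA")
ret_code = int("00000072", 16)

print(f"qualifier={high:04X} reason={low:04X}")
print(f"return code (decimal)={ret_code}")
```

The lower halfword is the part to compare when the same reason shows up from different modules.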


BPXI070E at shut down (cont.)

SLIP SET,IF,A=SYNCSVCD,RANGE=(10?+8C?+F0?+1F4?),

DATA=(13R??+1B0,EQ,058800AA),DSPNAME=('OMVS'.*),

SDATA=(ALLNUC,PSA,CSA,LPA,TRT,SQA,LSQA,RGN,SUM),

JL=OMVS,AL=(H,P,S,CU),END


IPCS OMVSDATA FILE


BPXI070E at shut down – Conclusions

File system SYS1.ROOT.MSYX.OMVSSIDA, mounted at /MSYX, failed to unmount because OMVS.ETC.MSYX is still mounted at /MSYX/etc

both file systems are owned by system number 02

OMVS.ETC.MSYX failed to unmount because of:

OMVS.CRON.MSYX mounted at /MSYX/etc/cron

OMVS.SPOOL.CRONLOG.MSYX mounted at /MSYX/etc/spool/cron/cronlog

OMVS.SPOOL.MSYX mounted at /MSYX/etc/spool

all 3 file systems are remotely owned by system 04
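
The dependency behind the JRFsParentFs reason code can be illustrated with the mount points listed above. This is a minimal sketch that approximates "mounted on" by comparing path prefixes; the real system tracks the parent relationship per mount, and the function name is hypothetical.

```python
# File system name -> mount point, as listed in the conclusions above.
MOUNTS = {
    "SYS1.ROOT.MSYX.OMVSSIDA": "/MSYX",
    "OMVS.ETC.MSYX": "/MSYX/etc",
    "OMVS.CRON.MSYX": "/MSYX/etc/cron",
    "OMVS.SPOOL.MSYX": "/MSYX/etc/spool",
    "OMVS.SPOOL.CRONLOG.MSYX": "/MSYX/etc/spool/cron/cronlog",
}


def mounted_below(fs_name, mounts=MOUNTS):
    """List the file systems whose mount point lies beneath fs_name's mount point."""
    prefix = mounts[fs_name].rstrip("/") + "/"
    return sorted(name for name, path in mounts.items() if path.startswith(prefix))


# Anything returned here has to be unmounted (or moved) first.
for blocker in mounted_below("OMVS.ETC.MSYX"):
    print(blocker)
```

An empty list for a file system means nothing is mounted beneath it, so an unmount would not fail for this particular reason.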


IPCS BPXWNXMB

Formats the NXMB, the control block that represents the OMVS XCF group members table

Checks whether the system is a member of a shared file system environment

Returns information about all members, their state, system name and number, as well as the active BPXMCDS couple data set definitions


BPXI070E at shut down – Conclusions

File systems:

OMVS.CRON.MSYX mounted at /MSYX/etc/cron

OMVS.SPOOL.CRONLOG.MSYX mounted at /MSYX/etc/spool/cron/cronlog

OMVS.SPOOL.MSYX mounted at /MSYX/etc/spool

are remotely owned by system MSYS while their parent file system is owned by system MSYX. The ownership has changed for an unknown reason.

Questions:

When did the change occur?


BPXI070E at shut down – Conclusions

Answers:

An internal control block contains a timestamp of the last time the owner of the file system changed.

The SLIP matched at shut down at 06:57:05.980519 local time.

The last owner change happened at 06:56:48.120832 local time on the same day.

These file systems are mounted with AUTOMOVE=Y while the parent is mounted with AUTOMOVE=U.
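
How close the two events are is easier to see as a difference. A small sketch, using only the two local timestamps quoted above and assuming both fall on the same day as stated:

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"

# Local times from the dump analysis above.
slip_match = datetime.strptime("06:57:05.980519", FMT)
owner_change = datetime.strptime("06:56:48.120832", FMT)

gap = slip_match - owner_change
print(f"owner changed {gap.total_seconds()} seconds before the SLIP matched")

# The ownership move happened during the shutdown sequence itself.
within_one_minute = gap < timedelta(minutes=1)
print(within_one_minute)
```

A gap this short suggests the ownership change belongs to the shutdown processing rather than to an earlier, unrelated event.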


Hiper Apar OA34226

ORPHANED PPRA SIGNAL LATCHES *MASTER* MEMTERM ABEND0C4 BPXPRTRM SYS.BPX.AP00.PRTB1.PPRA.LSN

Shut down of a system (SYS1) in a shared file system environment

Latch contention on a different system (SYS2)

Reinitialization of SYS1 into the shared file system environment is impossible due to latch contention on SYS2

SYS2 performed MemberGoneRecovery for SYS1

contention on the mount latch due to an orphaned PPRA latch

the 'D OMVS,W' command shows only mount latch activity


Hiper Apar OA34226 – D OMVS,W

BPXO063I 01.46.02 DISPLAY OMVS 886

OMVS 0010 ACTIVE OMVS=(A0,00,R0,A1)

MOUNT LATCH ACTIVITY:

USER ASID TCB REASON AGE

---HOLDER:

OMVS 0010 009FC3E8 MemberGone Rcvry 00.00.15

IS DOING: BRLM Wait <--- misleading !

FILE SYSTEM: OESYS.WILY.PRODPLEX.INTRO810.ZFS

WAITER(S):

OMVS 0010 009A0160 FileSys Unmount 00.00.03


Hiper Apar OA34226 – D GRS,C

D GRS,C

ISG343I 05.30.00 GRS STATUS
LATCH SET NAME: SYS.BPX.AP00.PRTB1.PPRA.LSN
CREATOR JOBNAME: OMVS  CREATOR ASID: 0010
LATCH NUMBER: 2056
  REQUESTOR  ASID  EXC/SHR    OWN/WAIT  WORKUNIT  TCB  ELAPSED
  *MASTER*   0001  EXCLUSIVE  OWN       009DBE88  Y    16:53:59
  OMVS       0010  SHARED     WAIT      009FC3E8  Y    03:44:12
LATCH SET NAME: SYS.BPX.A000.FSLIT.FILESYS.LSN
CREATOR JOBNAME: OMVS  CREATOR ASID: 0010
LATCH NUMBER: 2
  REQUESTOR  ASID  EXC/SHR    OWN/WAIT  WORKUNIT  TCB  ELAPSED
  OMVS       0010  EXCLUSIVE  OWN       009FC3E8  Y    03:44:12
  OMVS       0010  EXCLUSIVE  WAIT      009A0160  Y    03:44:00
  OMVS       0010  EXCLUSIVE  WAIT      009D04E0  Y    03:38:42


Hiper Apar OA34226 – What does the dump show?

IPCS ANALYZE RESOURCE

RESOURCE #0012:
  NAME=SYS.BPX.A000.FSLIT.FILESYS.LSN ASID=0010 Latch#=2
RESOURCE #0012 IS HELD BY:
  JOBNAME=OMVS ASID=0010 TCB=009FC3E8
  DATA=EXCLUSIVE RETADDR=BD24A324 REQID=001000003D011540
RESOURCE #0012 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009A0160
  DATA=EXCLUSIVE RETADDR=BD28CD70 REQID=001000001976B8D0


Hiper Apar OA34226 – What does the dump show?

IPCS ANALYZE RESOURCE (cont.)

RESOURCE #0011:
  NAME=SYS.BPX.AP00.PRTB1.PPRA.LSN ASID=0010 Latch#=2056
RESOURCE #0011 IS HELD BY:
  JOBNAME=*MASTER* ASID=0001 TCB=009DBE88
  DATA=EXCLUSIVE RETADDR=BD421A06 REQID=01E4080841AE2300
RESOURCE #0011 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009FC3E8
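
Reading the two resources together gives a chain: the unmount task waits for the mount latch, whose holder in turn waits for the PPRA latch. As a rough sketch, the chain can be followed mechanically. The data structure and function name are hypothetical; the job names, TCB addresses and latch names are copied from the ANALYZE RESOURCE output.

```python
# (latch set name, latch number) -> holder and waiters, each as (jobname, TCB).
LATCHES = {
    ("SYS.BPX.A000.FSLIT.FILESYS.LSN", 2): {
        "holder": ("OMVS", "009FC3E8"),
        "waiters": [("OMVS", "009A0160")],
    },
    ("SYS.BPX.AP00.PRTB1.PPRA.LSN", 2056): {
        "holder": ("*MASTER*", "009DBE88"),
        "waiters": [("OMVS", "009FC3E8")],
    },
}


def blocking_chain(task, latches=LATCHES):
    """Follow 'waits for' links from task until a holder that waits for nothing."""
    chain = [task]
    seen = {task}
    while True:
        blocker = next(
            (info["holder"] for info in latches.values() if chain[-1] in info["waiters"]),
            None,
        )
        # Stop at the end of the chain, or on a cycle (a true deadlock).
        if blocker is None or blocker in seen:
            return chain
        chain.append(blocker)
        seen.add(blocker)


for jobname, tcb in blocking_chain(("OMVS", "009A0160")):
    print(jobname, tcb)
```

The last entry of the chain is the task to investigate; here it is the TCB in *MASTER* that owns the PPRA latch.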


Hiper Apar OA34226 – What does the dump show?

A latch is represented by an LQE (Latch Queue Element) within a latch set (LSET).

LSET and LQE reside in the creator's private storage (OMVS).

The LQE contains a timestamp of when the latch was obtained.


Hiper Apar OA34226 – What does the dump show?

IPCS LTOD – formats TOD (time-of-day) stamps

IPCS LTOD C7486B4077C11780

Shows that the latch was obtained on the 4th of February 2011, while the contention was reported
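
For readers without IPCS at hand, the conversion that LTOD performs can be approximated as follows. This is a simplified sketch, not the IPCS implementation: it relies on the TOD clock format, in which bit 51 increments once per microsecond and the epoch is 1 January 1900, and it ignores leap seconds and the local time zone offset. The function name is hypothetical.

```python
from datetime import datetime, timedelta

# The TOD clock counts from 1900-01-01 00:00:00.
TOD_EPOCH = datetime(1900, 1, 1)


def tod_to_datetime(tod_hex: str) -> datetime:
    """Convert a 64-bit TOD clock value (hex string) to an approximate UTC datetime."""
    tod = int(tod_hex, 16)
    # Dropping the low-order 12 bits leaves a count of microseconds.
    micros = tod >> 12
    return TOD_EPOCH + timedelta(microseconds=micros)


stamp = tod_to_datetime("C7486B4077C11780")
print(stamp.date())
```

The date computed for the LQE timestamp agrees with the one LTOD reports on the slide.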


Hiper Apar OA34226 – What does the dump show?

CTRACE COMP(SYSOMVS) LOCAL FULL OPTIONS((EXCEPTION))

gathers exception information that is written to a separate ctrace buffer

the OMVS ctrace does not need to be switched on

shows that the TCB in the MASTER address space abended at the time the latch was obtained

OMVS recovery routines did not release the latch, so the latch was left in an orphaned state


Hiper Apar OA34226 – What does the dump show?

CTRACE COMP(SYSOMVS) LOCAL FULL OPTIONS((EXCEPTION))

F '02/04/2011'


Hiper Apar OA34226 – Conclusions

USS recovery routine BPXPRTRM was redesigned to ensure latches are released even if it abends itself during recovery / memory / process termination

a dump is always necessary to decide whether the latch is orphaned

a latch purge tool is available and can be sent out on demand

can avoid an IPL

CALLRTM can be tried as well

cannot be made generally available for data integrity reasons

new message BPXM123E is issued if a latch is held by a single task for more than 5 minutes (starting with z/OS 1.12)

would in this special case point to the PPRA latch held before the


Almost done ...

Any wishes with regards to topics for the next guide?

Any concerns / questions?
