W&M ScholarWorks

Dissertations, Theses, and Masters Projects Theses, Dissertations, & Master Projects

1998

A probability programming language: Development and applications

Andrew Gordon Glen

College of William & Mary - Arts & Sciences

Follow this and additional works at: https://scholarworks.wm.edu/etd

Part of the Computer Sciences Commons, and the Statistics and Probability Commons

Recommended Citation

Glen, Andrew Gordon, "A probability programming language: Development and applications" (1998). Dissertations, Theses, and Masters Projects. Paper 1539623920.

https://dx.doi.org/doi:10.21220/s2-1tqv-w897


A Probability Programming Language:
Development and Applications

A Dissertation
Presented to
The Faculty of the Department of Applied Science
The College of William & Mary in Virginia

In Partial Fulfillment
Of the Requirements for the Degree of
Doctor of Philosophy

by
Andrew G. Glen

UMI Number: 9904264

Copyright 1999 by Glen, Andrew Gordon

All rights reserved.

UMI Microform 9904264

Copyright 1998, by UMI Company. All rights reserved. This microform edition is protected against unauthorized

copying under Title 17, United States Code.

UMI

300 North Zeeb Road Ann Arbor, MI 48103

APPROVAL SHEET

This Dissertation is submitted in partial fulfillment of
the requirements for the Degree of
Doctor of Philosophy

Andrew G. Glen

APPROVED, January 1998

Lawrence M. Leemis, Dissertation Advisor
John H. Drew
Sidney H. Lawrence
Rex K. Kincaid
Donald R. Barr, Outside Examiner

Contents

1 Introduction
1.1 General
1.2 Literature review
1.3 Outline of the dissertation
1.4 Notation and nomenclature

2 Software Development
2.1 The common data structure
2.2 Common continuous, univariate distributions
2.3 The six representations of distributions
2.4 VerifyPDF
2.5 MedianRV
2.6 DisplayRV
2.7 PlotDist
2.8 ExpectationRV
2.9 Transform
2.10 OrderStat
2.11 ProductRV and ProductIID
2.12 SumRV and SumIID
2.13 MinimumRV
2.14 MaximumRV
2.15 Maximum likelihood estimation

3 Transformations of Univariate Random Variables
3.1 Introduction
3.2 Theorem
3.3 Implementation
3.4 Examples
3.5 Conclusion

4 Products of Random Variables
4.1 Introduction
4.2 Theorem
4.3 Implementation
4.4 Examples
4.5 Conclusion

5 Computing the CDF of the Kolmogorov-Smirnov Test Statistic
5.1 Introduction
5.3 Computing the distribution of $D_n$
5.3.1 Phase 1: Partition the support of $D_n - \frac{1}{2n}$
5.3.2 Phase 2: Define the A matrices
5.3.3 Phase 3: Set limits on the appropriate integrals
5.3.4 Phase 4: Shift the distribution
5.4 Critical values and significance levels
5.5 Conclusion

6 Goodness of Fit using Order Statistics
6.1 Introduction
6.2 The P-vector
6.3 Improving computation of P
6.4 Goodness-of-fit testing
6.5 Conclusion

7 Other Applications and Examples
7.1 Introduction
7.2 Exactness in lieu of CLT approximations
7.3 A mathematical resource generator
7.4 Probabilistic model design: reliability block diagrams
7.5 Modeling with hazard functions
7.6 Outlier detection

8 Conclusion and Further Work

A The Arctangent Survival Distribution
A.1 Introduction
A.2 Development
A.3 Probabilistic properties
A.4 Statistical inference
A.5 Conclusion

B Continuous Distributions

C Algorithm for 6 x 6 Conversions

D Algorithms for Various Procedures
D.1 Algorithm for VerifyPDF
D.2 Algorithm for ExpectationRV
D.3 Algorithm for OrderStat
D.4 Algorithm for ProdIID
D.5 Algorithm for SumRV
D.6 Algorithm for SumIID
D.7 Algorithm for MinimumRV
D.8 Algorithm for MaximumRV
D.9 Algorithm for MLE

F Algorithm for ProductRV

G Algorithm for KSRV

H Simulation Code for MLEOS Approximations

Bibliography

Acknowledgement

I must gratefully acknowledge the assistance of many people who have helped in very many ways in the production of this dissertation and research. Professors Hank Krieger and Marina Kondratovitch gave help in specific parts of this research. A special note of personal thanks goes to Professors Donald Barr, John Drew, and especially Larry Leemis, who were tremendously patient, insightful, and supportive of the research. I thank for their patience and support my wife, Lisa Glen, and my children Andrea, Rebecca, and Mary. I also thank and praise almighty God, for it is only by His divine will that we are privileged to understand what we have learned in this [...]

List of Figures

3.1 The transformation $Y = g(X) = X^2$ for $-1 < X < 2$
3.2 The transformation $Y = g(X) = ||X - 3| - 1|$ for $0 < X < 7$
3.3 The transformation $Y = g(X)$ has a discontinuity and is variously 1-to-1 and 2-to-1 on different subsets of the support of $X$
3.4 The transformation $Y = g(X) = \sin^2(X)$ for $0 < X < 2\pi$
4.1 The support of $X$ and $Y$ when $ad < bc$
4.2 The mapping of $Z = X$ and $V = XY$ when $ad < bc$
4.3 The PDF of $V = XY$ for $X \sim N(0, 1)$ and $Y \sim N(0, 1)$
5.1 The CDF of the $D_6$ random variable
5.2 The PDF of the $D_6$ random variable. Note the discontinuity at $y = 1/6$.
6.1 Transformations from iid observations $X_1, X_2, \ldots, X_n$ to the sorted P-vector elements $P_{(1)}, P_{(2)}, \ldots, P_{(n)}$
6.2 Estimated power functions for testing $H_0$: $X \sim N(0, 1)$ versus $H_1$: $X \sim N(0, \sigma^2)$ using K-S, A-D, and two statistics based on the [...]
7.1 Overlaid plots of $f_{Z^*}(x)$ and the standard normal PDF
7.2 Overlaid plots of the standard normal and standardized IG(0.8) distributions
7.3 RBD of a computer system with two processors and three memory units
7.4 The SF of the hypothesized BT-shaped hazard function fit to the sample [1, 11, 14, 16, 17] overlaid on the empirical SF
7.5 The PDF of the distribution having periodic hazard function $h_X$ with parameters $a = 1$, $b = 0.5$ and $c = 10$
7.6 The PDFs of the four order statistics from an exponential distribution
A.1 Examples of the arctangent probability density function
A.2 Empirical, fitted arctangent, and fitted Weibull survivor functions for the ball bearing lifetimes

List of Tables

5.1 Computational requirements for computing the $D_n$ CDF for small $n$
5.2 Computational efficiency associated with using the F and V arrays
5.3 CDFs of $D_n - \frac{1}{2n}$ for $n = 1, 2, \ldots, 6$
6.1 Estimated critical values for $P_s$ at various sample sizes and levels of significance
7.1 Fractiles for exact and approximated distributions
7.2 $P(X_{(6)} > 10)$ for $n = 6$ for several population distributions
7.3 The MSEs of the MLE and MLEOS and adjusted-for-bias MLE techniques of parameter estimation
A.1 Kolmogorov-Smirnov Goodness-of-fit Statistics for the Ball Bearing Data
B.1 Continuous distributions of random variables available in APPL

Abstract

A probability programming language is developed and presented; applications illustrate its use. Algorithms and generalized theorems used in probability are encapsulated into a programming environment with the computer algebra system Maple to provide the applied community with automated probability capabilities. Algorithms of procedures are presented and explained, including detailed presentations on three of the most significant procedures. Applications that encompass a wide range of applied topics including goodness-of-fit testing, probabilistic modeling, central limit theorem augmentation, generation of mathematical resources, and estimation are presented.

Chapter 1

Introduction

1.1 General

Probability theory, as it exists today, is a vast collection of axioms and theorems that, in essence, provides the scientific community many contributions, including:

• the naming and description of random variables that occur frequently in applications,
• the theoretical results associated with these random variables, and,
• the applied results associated with these random variables for statistical applications.

No one volume categorizes its work in exactly these three ways, but the literature's comprehensive works accomplish these goals. Whether voluminous, such as the work of Johnson, Kotz, and Balakrishnan (1995), or succinct, such as that of Evans, Hastings, and Peacock (1993), one finds all three of these areas presented in chapters that are organized on the first contribution above, naming and describing the random variables. Works such as Hogg and Craig (1995), Port (1994), and David (1981) organize their efforts according to the second contribution, covering theoretical results that apply to random variables. Then there are the works such as Law and Kelton (1991), Lehmann (1986), and D'Agostino and Stephens (1986) who concentrate on the statistical applications of random variables, and tailor their explanations of probability theory to the portions of the field that have application in statistical analysis.

In all these works, as well as countless others, one stark omission is apparent. There is no mention of an ability to automate the naming, processing, or application of random variables. This omission is even more profound when one considers the tedious nature of the mathematics involved in the execution of many of these results for all but the simplest of examples. In practice, the level of tedium makes the actual execution untenable for many random variables. Automation of certain types of these procedures could eradicate this tedium. There is an abundance of statistical software packages that give the scientific community powerful tools to apply statistical procedures. But to date, there is no package that attempts to automate the more theoretical side of the probabilist's work. Even the simplest of tasks, plotting a fully specified probability density function, is not provided in many statistical packages. In order to plot new, ad-hoc densities or CDFs, one is often required to write an [...]

A conceptual probability software package is now presented that begins to fill this existing gap. Before outlining this work's approach, let us present some examples to illustrate what should be included in such a probability software package.

Consider the following independent random variables: $W \sim \text{gamma}(\lambda, \kappa)$, $Z \sim N(\mu, \sigma)$, $Y \sim \text{Weibull}(\lambda, \kappa)$, $X$ and $R \sim \text{arctan}(\phi, \alpha)$, and $D$, $T$, $U$, and $V$ as specified in the questions below. See Appendix A for more information on the arctan distribution.

• What is the distribution of $V = W + X + Y$?

• What is the distribution of $T = X \cdot \ln(W^2) + e^{YZ}$?

• What is the distribution of a random distance $D$, which is the sum of the product of random rates $R_1, R_2, \ldots, R_n$ and random times $T_1, T_2, \ldots, T_n$, i.e., $D = R_1 \cdot T_1 + R_2 \cdot T_2 + \cdots + R_n \cdot T_n$?

• What is the distribution of the system lifetime $U$ in a reliability block diagram containing two parallel blocks of two subsystems that consist of two components in series, i.e., $U = \max\{\min\{V, W\}, \min\{V, Z\}\}$?

• What is the exact upper tail probability for the statistic 6.124 associated with the distribution of the 4th order statistic out of a sample of 12 iid observations that have the same distribution as $U$?

• How could one employ the PDFs of order statistics of a population, instead of the PDF of the population itself, to develop an alternate approach to maximum likelihood estimation?

• What is the distribution of the maximum likelihood estimator of the inverse Gaussian random variable's first parameter $\mu$ when a sample size of $n$ is specified?

• Most importantly, if one could find these answers, what is their utility to the statistical and applied science community?

There is no implication that the previously cited authors are remiss in neglecting the automation of probability software. In fact it is only with the advent and maturing of computer algebra systems, such as Maple and Mathematica, that there now exists the ability to automate probabilistic modeling and research. This doctoral research and dissertation will take advantage of this relatively new technology by developing and presenting a software "engine" that contributes to the fields of probabilistic modeling and statistical applications. This research has concentrated on procedures in the symbolic language Maple V.

The specific contributions to the applied probability and statistics community of this research and dissertation include the following:

1. Detailed algorithms that comprise the conceptual software.

2. Generalized versions of theorems that comprise the software. [Note that while [...] appear difficult to implement in an automated environment.]

3. An algorithm that produces the exact CDF of the Kolmogorov-Smirnov test statistic.

4. Detailed explanations and examples of the software's capabilities.

5. Application examples that contribute, on their own, to various areas within probability and statistics.

6. Explorative examples of probabilistic quests that appear to be difficult to carry out without automation.

7. Extensions of testing statistical hypotheses, to include specific contributions in the areas of outlier detection, goodness-of-fit, and parameter estimation.

8. Demonstrations in the general area of probabilistic model design, to include specific contributions in the areas of survival distributions, reliability block diagrams, exact solutions to central limit theorem (CLT) applications, and estimation.

1.2 Literature review

While the current literature will be reviewed throughout the dissertation, there are a number of works that should be mentioned for their general applicability to this research. These include the works of Johnson, Kotz, and Balakrishnan (1995), Leemis [...] theory behind the algorithms and implementation. The review of the literature has discovered no publication on implementing probabilistic procedures in computer algebra languages, nor on the benefits of such a paradigm. Databases searched include FirstSearch, INNOPAC, INSPEC, DTIC, NTIC, Science Citation Index, Library of Congress, Swem Library at The College of William & Mary, and the USMA library at West Point, NY. Search strings included the following individual subject areas and pairs of subject areas, where appropriate: distributions, goodness-of-fit, life testing, Maple, modeling, order statistics, probability, reliability, and symbolic algebra. While there are many listings under these terms and pairings of these terms, no work was found about combining probabilistic results with computer algebra implementation. The negative result of this search indicates that there is a lack of archival material on the subject.

1.3 Outline of the dissertation

This dissertation is presented according to the following outline. In Chapter 2 the development, abilities, and examples of use of the software language are presented. Chapter 3 contains the development of the procedure that accommodates transformations of random variables to include a re-stated, general, implementable theorem for such work. In Chapter 4 a procedure for finding the distribution of the product of two independent continuous random variables is presented. In Chapter 5 a procedure that [...] specified sample size, is presented. Chapter 6 contains an application in which a new goodness-of-fit test procedure is presented and tested using the software. Chapter 7 is a collection of examples of explorations in the fields of probability and statistics that are now possible due to the software. Finally, in Chapter 8, conclusions and suggestions of further work are given. In the appendices are listed the algorithms for the software, as well as documentation of the early work in creating new probabilistic models.

1.4 Notation and nomenclature

This section reviews certain notation and nomenclature used here. Use is made of the following acronyms and functional notation for density representations:

• probability density function (PDF) $f_X(x)$,

• cumulative distribution function (CDF) $F_X(x) = \int_{-\infty}^{x} f_X(s)\, ds$,

• survivor function (SF) $S_X(x) = 1 - F_X(x)$,

• hazard function (HF) $h_X(x) = \frac{f_X(x)}{S_X(x)}$,

• cumulative hazard function (CHF) $H_X(x) = \int_{-\infty}^{x} h_X(s)\, ds$, and

• inverse distribution function (IDF) $F_X^{-1}(x)$.
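As a small numerical illustration of how these six forms relate, the sketch below builds them for an exponential distribution with rate `lam`. It is written in Python rather than Maple, and the function name `exponential_representations` and the dictionary layout are assumptions made for this illustration; they are not part of the software described in this dissertation.

```python
import math

def exponential_representations(lam):
    """Return the six functional forms of an exponential(lam) distribution.

    Hypothetical helper for illustration only; the support is x >= 0.
    """
    pdf = lambda x: lam * math.exp(-lam * x)      # f(x)
    cdf = lambda x: 1.0 - math.exp(-lam * x)      # F(x), integral of f up to x
    sf = lambda x: 1.0 - cdf(x)                   # S(x) = 1 - F(x)
    hf = lambda x: pdf(x) / sf(x)                 # h(x) = f(x) / S(x)
    chf = lambda x: -math.log(sf(x))              # H(x) = -ln S(x), integral of h
    idf = lambda p: -math.log(1.0 - p) / lam      # F^{-1}(p) for 0 <= p < 1
    return {"PDF": pdf, "CDF": cdf, "SF": sf, "HF": hf, "CHF": chf, "IDF": idf}

forms = exponential_representations(2.0)
x = 0.7
print(forms["HF"](x))                 # constant hazard, equal to the rate
print(forms["CHF"](x))                # equals lam * x for the exponential
print(forms["IDF"](forms["CDF"](x)))  # the IDF undoes the CDF
```

For the exponential distribution the hazard function is constant, which makes it a convenient check that the relationships among the six forms have been coded consistently.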

Throughout the dissertation, the proposed software is referred to as “a probability [...] are used to refer to PDFs (and other functions) that can be constructed by piecing together various standard functions, such as polynomials, logarithms, exponentials, and trigonometric functions, e.g., the triangular(1, 2, 3) distribution, which has two segments or two pieces, each of which is a linear function. The common abbreviation “$N(\mu, \sigma)$” is used to refer to the normal distribution. Note that the second parameter is the standard deviation, not the variance. Also, “$U(a, b)$” is used to represent the uniform distribution with parameters $a$ and $b$. Subscripts in parentheses represent order statistics, e.g., the $r$th order statistic associated with a random sample $X_1, X_2, \ldots, X_n$ is denoted by $X_{(r)}$. The abbreviation “iid” is used to denote independent and identically distributed random variables. The terms “fully-specified,” “semi-specified,” and “unspecified” are used to describe the degree to which parameters are specified as constants or fixed parameters in a distribution. For example, the exponential(1) distribution is a fully specified distribution. The Weibull(1, $\kappa$) and the $N(0, \sigma)$ are both semi-specified distributions. The triangular($a$, $b$, $c$) and exponential($\lambda$) distributions are both unspecified. Typewriter font is used to represent Maple language statements. For example “> X := UniformRV(0, 1);” is a Maple assignment statement. Note that the symbol “>” represents the Maple input [...]

Chapter 2

Software Development

The notion of probability software is different from the notion of applied statistical software. Probability theory is rife with theorems and calculations that require symbolic, algebraic manipulations. Applied statistical calculations are usually numeric manipulations of data based on known formulas associated with distributions of common random variables. This section contains a discussion on several algorithms that contribute to the development of APPL. Availability of computer algebra systems such as Maple and Mathematica facilitates the development of software that will derive functions, as opposed to computing numbers.

Probability software must, at the most basic level, be a means of producing distributions of random variables. At the heart of the software must reside an “engine” that can compute new, useful representations of distributions.

The derivation of exact distribution functions of complex random variables is often untenable. In such cases, one had to be content with approximations and summary statistics of the unknown distributions, regardless of whether those approximations and summaries were adequate. For example, one often approximates distributions using Monte Carlo simulation or by invoking the central limit theorem. Statistics such as the sample mean and variance of the approximated distribution are then reported. If what was really needed was a certain percentile of the approximated distribution, oftentimes the entire simulation would need to be remodeled, re-validated, re-verified, and re-run to obtain the needed information. A result such as a fully-specified PDF would eradicate the need for such redundant efforts. One also would have analytical results to represent characteristics of certain complex random variables whose fully-specified functions are untenable. For example, renewal theory and compound Poisson process theory have results that derive the mean and variance of complex distributions, but fall short of actually determining the entire representation of complex distributions via a PDF, CDF, or some other form of the distribution. The proposed probabilistic software is designed to make a breakthrough into the area of completely describing complex distributions with PDFs, CDFs, and the like, thereby providing increased modeling capability to the analyst.
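The contrast between a simulated percentile and one read from an exact distribution can be sketched with a deliberately simple case: the sum of two independent U(0, 1) random variables, whose exact CDF is known in closed form. The Python code below is an illustration chosen for this purpose, not an example from the dissertation; the function names, the seed, and the replication count are all assumptions.

```python
import math
import random

def exact_cdf_sum_two_uniforms(y):
    """Exact CDF of Y = X1 + X2 for independent U(0, 1) random variables."""
    if y <= 0.0:
        return 0.0
    if y <= 1.0:
        return y * y / 2.0
    if y <= 2.0:
        return 1.0 - (2.0 - y) ** 2 / 2.0
    return 1.0

def exact_percentile(p):
    """Invert the exact CDF in closed form for 0 < p < 1."""
    if p <= 0.5:
        return math.sqrt(2.0 * p)
    return 2.0 - math.sqrt(2.0 * (1.0 - p))

def simulated_percentile(p, replications=20000, seed=0):
    """Estimate the same percentile from sorted Monte Carlo draws."""
    rng = random.Random(seed)
    draws = sorted(rng.random() + rng.random() for _ in range(replications))
    return draws[int(p * replications)]

exact_90 = exact_percentile(0.90)      # available once the CDF is known
approx_90 = simulated_percentile(0.90) # requires a full simulation run
print(exact_90, approx_90)
```

Once the exact CDF is in hand, any other percentile is a single function call, whereas the simulated value carries sampling error and depends on the number of replications.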

At the most general level, one could attempt to find distributions of intricate transformations of multivariate, dependent random variables. The software described here is limited to univariate, continuous, independent random variables, and the complex transformations and combinations that can result between independent random variables. A set of algorithms that derives functional representations of distributions [...] presented. Specifically, algorithms have been developed that will conduct the following operations:

• supply a common data structure for the distributions of continuous, univariate, random variables, including distribution functions that may be defined piecewise, e.g., the triangular distribution,

• convert any functional representation of a random variable into another functional representation using the common data structure, i.e., allowing conversion amongst the PDF, CDF, SF, HF, CHF, and IDF,

• verify that the area under a computed PDF is one,

• provide straightforward instantiation of well-known distributions, such as the exponential, normal, uniform, and Weibull distributions, with either numeric or symbolic parameters,

• determine the distribution of a simple transformation of a continuous random variable, $Y = g(X)$, including piecewise, continuous transformations,

• determine common summary characteristics of random variables, such as the mean, variance, other moments, and so forth,

• calculate the PDF of sums of independent random variables, i.e., $Y = X + Z$,

• calculate the PDF of products of independent random variables, i.e., $Y = XZ$,

• calculate the PDF of the minimum and maximum of independent random variables [...],

• calculate the PDF of the $r$th order statistic from a sample of $n$ iid random variables,

• calculate probabilities associated with a random variable,

• generate random variates associated with a random variable,

• plot any of the six functional forms of any distribution, e.g., the HF or CDF,

• provide basic statistical abilities, such as maximum likelihood estimation, for distributions defined on a single segment of support,

• complement the structured programming language that hosts the software (in this case Maple) so that all of the above mentioned procedures may be used in mathematical and computer programming in that language.
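To make one of the operations above concrete, the following Python sketch evaluates the PDF of the $r$th order statistic from $n$ iid observations using the standard formula $f_{X_{(r)}}(x) = \frac{n!}{(r-1)!\,(n-r)!} F_X(x)^{r-1} \left(1 - F_X(x)\right)^{n-r} f_X(x)$. It works numerically on ordinary Python functions, whereas the software described here derives such results symbolically; the names `order_stat_pdf` and `midpoint_area` are hypothetical and are not the procedures of the dissertation.

```python
import math

def order_stat_pdf(pdf, cdf, r, n):
    """Return the PDF of the r-th order statistic of n iid draws (1 <= r <= n)."""
    coeff = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))

    def g(x):
        F = cdf(x)
        return coeff * F ** (r - 1) * (1.0 - F) ** (n - r) * pdf(x)

    return g

def midpoint_area(f, lo, hi, steps=1000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Assumed example population: U(0, 1)
uniform_pdf = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)

g_median = order_stat_pdf(uniform_pdf, uniform_cdf, r=2, n=3)  # 6 x (1 - x)
g_min = order_stat_pdf(uniform_pdf, uniform_cdf, r=1, n=3)     # 3 (1 - x)^2
print(g_median(0.5), midpoint_area(g_median, 0.0, 1.0))
```

Setting $r = 1$ or $r = n$ recovers the minimum and maximum of an iid sample, so the same function also illustrates those two special cases.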

2.1

T h e c o m m o n d a ta str u c tu r e

Implicit in a probability software language is a common, succinct, intuitive, and manipulatable data structure for describing the distribution of a random variable. This implies there should be one data structure that applies to the CDF, PDF, SF, HF, CHF, and IDF. The common data structure used in this software is referred to as the "list-of-lists." Specifically, any functional representation of a random variable is presented in a list that contains three sub-lists, each with a specific purpose. The first sub-list contains the ordered functions that define the segments of the distribution.


For example, the PDF representation of the triangular distribution would have the two linear functions that comprise the two segments of its PDF for its first sub-list. Likewise, the CDF representation of the triangular distribution would have the two quadratic functions that comprise the two segments of its CDF for its first sub-list. The second sub-list is an ordered list of real numbers that delineate the end points of the segments for the functions in the first sub-list. The end point of each segment is automatically the start point of the succeeding segment. The third sub-list indicates what distribution form the functions in the first sub-list represent. The first element of the third sub-list is either the string Continuous for continuous distributions or Discrete for discrete distributions. The second element of the third sub-list shows which of the six functional forms is used in the first sub-list. The string PDF, for example, indicates the list-of-lists is currently a PDF list-of-lists. Likewise, CDF indicates that a CDF is being represented.
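As a rough illustration of the structure just described, the following sketch mirrors the three sub-lists with ordinary Python lists. Python is used here, rather than Maple, only so that the sketch can be run without APPL. It is not part of APPL: the helper name evaluate is hypothetical, and returning 0 outside the support is an assumption that is appropriate only when the stored form is a PDF.

```python
# Python analogue of the list-of-lists (not APPL code).
# The helper name `evaluate` is hypothetical.

def evaluate(rv, x):
    """Evaluate the functional form stored in rv at the point x."""
    functions, breakpoints, _labels = rv
    # Segment i covers breakpoints[i] <= x <= breakpoints[i + 1].
    for i, f in enumerate(functions):
        if breakpoints[i] <= x <= breakpoints[i + 1]:
            return f(x)
    # Assumption: zero outside the support, which is right for a PDF only.
    return 0.0


# PDF of a triangular(0, 1, 2) random variable: two linear segments.
X = [
    [lambda x: x, lambda x: 2 - x],  # sub-list 1: ordered segment functions
    [0, 1, 2],                       # sub-list 2: segment end points
    ["Continuous", "PDF"],           # sub-list 3: type and functional form
]

print(evaluate(X, 0.5))  # value on the first segment
print(evaluate(X, 1.5))  # value on the second segment
print(evaluate(X, 3.0))  # outside the support
```

One consequence of this layout is that a distribution with k segments always carries k functions and k + 1 end points, which is easy to check before any computation is attempted.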

Examples:

• The following Maple statement assigns the variable X to a list-of-lists that represents the PDF of a U(0, 1) random variable:

> X := [[x -> 1], [0, 1], ['Continuous', 'PDF']];

• The triangular distribution has a PDF with two pieces. The following statement defines a triangular(0, 1, 2) random variable X as a list-of-lists:

> X := [[x -> x, x -> 2 - x], [0, 1, 2], ['Continuous', 'PDF']];


• An exponential random variable X with a mean of 2 can be defined in terms of its hazard function with the statement:

> X := [[x -> 1 / 2], [0, infinity], ['Continuous', 'HF']];

• Unspecified parameters can be represented symbolically. A N(θ, 1) random variable X can be defined with the statement:

> X := [[x -> exp(-(x - theta)^2 / 2) / sqrt(2 * Pi)], [-infinity, infinity], ['Continuous', 'PDF']];

• The parameter space can be specified by using the Maple assume function. Consider the random variable T with HF

$$h(t) = \begin{cases} \lambda & 0 < t < 1 \\ \lambda t & t > 1 \end{cases}$$

for λ > 0. The random variable T can be defined by the statements:

> assume(lambda > 0);
> T := [[t -> lambda, t -> lambda * t], [0, 1, infinity], ['Continuous', 'HF']];

• The syntax allows for the endpoints of the segments associated with the support of the random variable to be specified symbolically. A U(a, b) random variable X is defined by:

> X := [[x -> 1 / (b - a)], [a, b], ['Continuous', 'PDF']];

• No error checking is performed when a distribution is defined. This means that the following statement is accepted even though it does not define a legitimate PDF:

> X := [[x -> 6], [0, 5], ['Continuous', 'PDF']];

Some error checking will be performed by the procedure VerifyPDF, which is presented in a subsequent section.

2.2 Common continuous, univariate distributions

Syntax: The command

> X := RandomVariableNameRV(ParameterSequence);

assigns to the variable X a list-of-lists representation of the specified random variable. The arguments in ParameterSequence may be real, integer, or string (for symbolic parameters).

Purpose: Included in the prototype software is the ability to instantiate common distributions. While the list-of-lists is a functional form that lends itself to the mathematics of the software, it is not an instantly recognizable form for representing a distribution. A number of simple procedures are provided that take relatively common definitions of distributions and convert them to a list-of-lists format. The included distributions are well-known ones, such as the normal, Weibull, exponential, and gamma. A complete list of the distributions provided in APPL, including their parameters, is presented in Appendix B.
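A minimal Python sketch of what such an instantiation procedure might look like is given here. It is not the APPL source: the function names exponential_rv and weibull_rv are hypothetical, the exponential parameter is assumed to be a rate, and the Weibull parameterization $f(x) = \kappa \lambda^{\kappa} x^{\kappa - 1} e^{-(\lambda x)^{\kappa}}$ is an assumption, since Appendix B is not reproduced in this section.

```python
import math


def exponential_rv(lam):
    """List-of-lists PDF for an exponential distribution (lam assumed to be a rate)."""
    return [
        [lambda x: lam * math.exp(-lam * x)],
        [0, math.inf],
        ["Continuous", "PDF"],
    ]


def weibull_rv(lam, kappa):
    """List-of-lists PDF for a Weibull distribution (parameterization is an assumption)."""
    return [
        [lambda x: kappa * lam**kappa * x ** (kappa - 1)
                   * math.exp(-((lam * x) ** kappa))],
        [0, math.inf],
        ["Continuous", "PDF"],
    ]


X = exponential_rv(2.0)
W = weibull_rv(2.0, 1.0)

# Under the assumed parameterization, a Weibull with shape 1 has the
# same density as an exponential with the same rate.
print(X[0][0](0.7), W[0][0](0.7))
```

Each procedure only packages a density, a support, and a label, so downstream procedures never need to know which named distribution produced the list.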

Special Issues: The suffix RV is added to each name to make it readily identifiable


normal and gamma. The first letter of each word is capitalized, which is the case for all procedures in APPL. Also, there is no space between words in a procedure call, e.g., an inverse Gaussian random variable may be defined by the command InverseGaussianRV(Parameter1, Parameter2). The result is usually returned as a PDF, but in the case of the IDB distribution, a CDF is returned. The CDF of the IDB distribution, it turns out, is easier for Maple to manipulate (e.g., integrate, differentiate) than the PDF. Certain assumptions are made about unspecified parameters. For example, an assignment of an unspecified exponential random variable (see the second example below) will result in the assumption that λ > 0. This assumption, as with all other distributions' assumptions, is only applied to unspecified parameters. The assumptions allow Maple to carry out certain types of symbolic integration, such as verifying that the area under the density is in fact one for a PDF (see Section 2.4).

Examples:

• The exponential(1) distribution may be created with the following statement:

> X := ExponentialRV(1);

• The exponential(λ) distribution, where λ > 0, may be created as follows:

> X := ExponentialRV(lambda);

• These procedures also allow a modeler to reparameterize a distribution. The exponential(1/θ) distribution, where θ > 0, for example, may be created as follows:

> assume(theta > 0);


• The semi-specified Weibull(λ, 1) distribution, where λ > 0, may be created as follows:

> X := WeibullRV(lambda, 1);

Note that this is a special case where the Weibull distribution is equivalent to an exponential distribution.

• The standard normal distribution may be created as follows:

> X := NormalRV(0, 1);

All distributions presently included in APPL and their parameterizations are listed in Appendix B.

2.3 The six representations of distributions

Syntax: The command

DesiredForm(RandomVariable [, Statistic]);

returns the list-of-lists format of the desired functional representation of the distribution, where DesiredForm is one of the following: PDF, CDF, SF, HF, CHF, or IDF. The single argument RandomVariable must be in the list-of-lists format. The optional argument, Statistic, may be a constant or a string.

Purpose: The 6 x 6 distribution conversion ability, a variation of the matrix outlined by Leemis (1995, p. 55), is provided so that the functional form of a distribution can


be converted to and from its six forms: the PDF, CDF, SF, HF, IDF, and CHF. This set of procedures will take one form of the distribution as an argument and return the desired form of the distribution in the appropriate list-of-lists format. For the one-parameter call, the functional representation will be returned. For the two-parameter call, the actual value of the function at that point will be returned.

Special Issues: The procedures are fairly robust against non-specified parameters for the distributions that will be converted (see the fourth example below).

Examples:

• To obtain the CDF form of a standard normal random variable:

> X := NormalRV(0, 1);
> X := CDF(X);

or, equivalently, in a single line,

> X := CDF(NormalRV(0, 1));

Since the CDF for a standard normal random variable is not closed form, APPL returns the following:

$X := [[x \to \tfrac{1}{2}\,\mathrm{erf}(\tfrac{1}{2}\sqrt{2}\,x) + \tfrac{1}{2}],\ [-\infty, \infty],\ [\mathrm{Continuous}, \mathrm{CDF}]]$

• If X ~ N(0, 1), then the following statements can be used to find P(X < 1.96) ≈ 0.975:

> X := NormalRV(0, 1);
> prob := CDF(X, 1.96);


• Should the hazard function of an exponential distribution be entered, its associated PDF may be determined as follows:

> X := [[x -> 1], [0, infinity], ['Continuous', 'HF']];
> X := PDF(X);

• For the case of unspecified parameters, the following statements convert an unspecified Weibull PDF to an unspecified Weibull SF:

> X := WeibullRV(lambda, kappa);
> X := SF(X);

which returns:

$X := [[x \to e^{-(\lambda{\sim}\, x)^{\kappa{\sim}}}],\ [0, \infty],\ [\mathrm{Continuous}, \mathrm{SF}]]$

Note that the tildes after the parameters indicate that assumptions have been made concerning the parameters (i.e., λ > 0 and κ > 0) in the WeibullRV procedure.

• Finding a quantile of a distribution requires the IDF procedure. If X ~ Weibull(1, 2), then the 0.975 quantile of the distribution can be found with the statement

> quant := IDF(WeibullRV(1, 2), 0.975);

• The procedures can be nested so that if the random variable X has been defined in terms of its PDF, then the statement


does nothing to the list-of-lists representation for X, assuming that all transformations can be performed analytically.

Algorithm: The conversions are shown in a 6 x 6 matrix in Appendix C. Each element of the matrix takes the "row" and converts it to the type specified in the "column" heading. Thus the first row, second element of the matrix shows a call to the CDF procedure using the PDF representation of a random variable as an argument, which returns the CDF representation of a random variable.
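Two cells of such a conversion matrix can be sketched numerically using the standard relationships $S(x) = 1 - \int_a^x f(u)\,du$ and $f(x) = h(x)\,e^{-H(x)}$, where $H(x) = \int_a^x h(u)\,du$ is the cumulative hazard and $a$ is the lower end of the support. The Python below is an illustration only: APPL performs these conversions symbolically in Maple, whereas this sketch substitutes composite Simpson integration, and the names simpson, pdf_from_hf, and sf_from_pdf are hypothetical.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pdf_from_hf(hf, lower):
    """HF -> PDF: f(x) = h(x) * exp(-H(x)), H being the integral of h from lower to x."""
    def pdf(x):
        chf = simpson(hf, lower, x)  # cumulative hazard H(x)
        return hf(x) * math.exp(-chf)
    return pdf


def sf_from_pdf(pdf, lower):
    """PDF -> SF: S(x) = 1 - integral of f from lower to x."""
    return lambda x: 1.0 - simpson(pdf, lower, x)


# Constant hazard 1 on [0, infinity), i.e. an exponential distribution.
f = pdf_from_hf(lambda x: 1.0, 0.0)
S = sf_from_pdf(f, 0.0)

print(f(1.0), S(1.0), math.exp(-1.0))
```

For a constant hazard of 1 both printed conversions should agree closely with exp(-1), which matches the hazard-function example given earlier in this section.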

2.4 VerifyPDF

Syntax: The command

VerifyPDF(RandomVariable);

returns true or false, depending on whether or not the PDF integrates to one. The single argument RandomVariable must be in the list-of-lists format described previously. In addition, the procedure prints

'The area under the PDF is',

along with the area, and "true" if the area is 1.0 or "false" if the area is not 1.0.

Purpose: The purpose of this procedure is to help determine if a random variable in the list-of-lists format is in fact a viable representation of a continuous distribution. Specifically, the procedure converts the distribution to the PDF form and carries out the definite integration of the PDF to see if the area under the PDF is 1. If so, it prints


The area under the PDF is, 1

and returns "true"; otherwise it prints the computed area and returns "false" if the area is more than 0.0000001 away from 1. This procedure is primarily an indicator tool to check if the list-of-lists format of a random variable has been input correctly.

Special Issues: The procedure only integrates the area under each segment of the PDF of the argument RandomVariable. It does not check for negative functional values of f(x). The third example below shows that the continuous function

$$f(x) = 3|x| - 1, \qquad -1 < x < 1,$$

integrates to one, yet is not a PDF since f(0) = -1.

For many well-known distributions, the procedure will carry out the symbolic integration and verify that the area under the PDF is one, as illustrated in the second example below. Not all of the distributions described in Section 2.2 have this symbolic capability, but most do. For example, the unspecified log normal distribution will integrate to one, but the unspecified inverse Gaussian distribution will not; see the fourth example below.

Examples:

• The following Maple statements create an exponential random variable X with a mean of 1, verify that the area under f(x) is one, and return true from VerifyPDF:

> X := ExponentialRV(1);
> VerifyPDF(X);

• Since assumptions are made internally in ExponentialRV about the parameter space, the following two statements will also return true:

> X := ExponentialRV(lambda);
> VerifyPDF(X);

• The following code defines a function f(x) such that $\int_{-1}^{1} f(x)\,dx = 1$ and f(0) = -1, so that VerifyPDF returns true even though this is not a legitimate PDF:

> X := [[x -> 3 * abs(x) - 1], [-1, 1], ['Continuous', 'PDF']];
> VerifyPDF(X);

• Maple is not able to conduct the integration for more complex distributions. In this example, X is assigned the unspecified inverse Gaussian distribution, and an attempt to integrate the area under the density is unsuccessful.

> X := InverseGaussianRV(p1, p2);
> VerifyPDF(X);

These statements return an error message indicating that the function does not evaluate to numeric. The assumption is made that future releases of Maple will be able to correctly integrate this PDF.

Algorithm: The algorithm first checks to see whether the distribution of interest is continuous. Next, it checks to see if the distribution of the random variable is represented by a PDF. If not, it converts a local distribution to a PDF form using the


PDF procedure. The area under the PDF is calculated and printed. The returned value from the procedure is "true" if the area is within 0.0000001 of 1, and "false" otherwise. The algorithm is given in Appendix D.
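The check described above can be sketched in Python as follows. This is a numerical stand-in rather than the Appendix D algorithm: APPL integrates symbolically in Maple, whereas the sketch uses composite Simpson integration, handles only finite supports already given in PDF form, and uses the hypothetical names simpson and verify_pdf. The 0.0000001 tolerance is taken from the text.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def verify_pdf(rv, tol=1e-7):
    """Print the area under a continuous PDF and report whether it is within tol of 1."""
    functions, breakpoints, labels = rv
    if labels != ["Continuous", "PDF"]:
        raise ValueError("this sketch handles continuous PDFs only")
    # Integrate each segment over its own sub-interval and add the pieces.
    area = sum(
        simpson(f, breakpoints[i], breakpoints[i + 1])
        for i, f in enumerate(functions)
    )
    print("The area under the PDF is", area)
    return abs(area - 1.0) <= tol


triangular = [[lambda x: x, lambda x: 2 - x], [0, 1, 2], ["Continuous", "PDF"]]
too_big = [[lambda x: 6], [0, 5], ["Continuous", "PDF"]]
negative = [[lambda x: 3 * abs(x) - 1], [-1, 1], ["Continuous", "PDF"]]

print(verify_pdf(triangular))
print(verify_pdf(too_big))
print(verify_pdf(negative))
```

The third call illustrates the limitation noted under Special Issues: a function that dips below zero can still pass an area-only check, so a separate non-negativity test would be needed to rule it out.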

2.5 MedianRV

Syntax: The command

MedianRV(RandomVariable);

returns the median of a specified distribution.

Purpose: This procedure returns the median of a random variable.

Special Issues: It is fairly robust for use with distributions that have unspecified parameters.

Examples:

• For the fully-specified Weibull distribution, the following statements will assign the median of the distribution to the variable m.

> X := WeibullRV(1, 2);
> m := MedianRV(X);

• The following statements determine the median of an exponential random variable with unspecified parameters:

> X := ExponentialRV(lambda);
> m := MedianRV(X);


Algorithm: The algorithm is a special case of the two-parameter IDF procedure call, where the second parameter is 1/2.
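The idea of treating the median as the 0.5 quantile can be sketched numerically in Python. This is an assumption-laden stand-in: APPL inverts the CDF symbolically, whereas the sketch below uses bisection on a CDF supplied as a Python function, and the names idf_bisect and median are hypothetical.

```python
import math


def idf_bisect(cdf, lower, upper, p, tol=1e-12):
    """Solve cdf(x) = p on [lower, upper] by bisection (cdf must be nondecreasing)."""
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return (lo + hi) / 2


def median(cdf, lower, upper):
    """The median is the quantile with p = 1/2."""
    return idf_bisect(cdf, lower, upper, 0.5)


# CDF of an exponential distribution with rate 1; its median is ln 2.
cdf = lambda x: 1.0 - math.exp(-x)
m = median(cdf, 0.0, 50.0)
print(m, math.log(2))
```

The finite upper bound of 50 is an arbitrary bracket chosen for the sketch; a symbolic inverse, as in APPL, avoids the need for one.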

2.6 DisplayRV

Syntax: The command

DisplayRV(RandomVariable);

displays the list-of-lists format of the distribution in standard mathematical notation, using the Maple piecewise procedure.

Purpose: The purpose of this procedure is to make the list-of-lists representation of a distribution more readable. A long list-of-lists with several segments is not easy to understand. This procedure converts a list-of-lists formatted distribution into the Maple-syntaxed "piecewise" function. Such versions of segmented functions are displayed in a more readable manner in Maple. It also states whether the current representation is a PDF, CDF, etc. There is no computation in this procedure.

Special Issues: None.

Example:

• The piecewise triangular distribution could be displayed as follows:

> DisplayRV(TriangularRV(1, 2, 3));


This random variable is currently represented as follows:

'Continuous', 'PDF'

$$\begin{cases} 0 & x < 1 \\ x - 1 & x < 2 \\ 3 - x & x < 3 \end{cases}$$

Algorithm: This algorithm is a set of commands that creates a sequence of conditions and functions in a manner that is usable by the piecewise command.
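The construction of that sequence can be sketched in Python, with expressions held as plain strings. This is only an illustration of the pairing of conditions and functions: the function name piecewise_pairs is hypothetical, and placing a zero piece below the first end point is an assumption based on the displayed triangular output.

```python
def piecewise_pairs(expressions, breakpoints):
    """Build (condition, expression) pairs in the order a piecewise display needs."""
    # Assumption: the function is zero to the left of the support.
    pairs = [(f"x < {breakpoints[0]}", "0")]
    # Segment i is paired with the upper end point of its sub-interval.
    for expr, upper in zip(expressions, breakpoints[1:]):
        pairs.append((f"x < {upper}", expr))
    return pairs


# Segments of the triangular(1, 2, 3) PDF, written as strings.
for condition, expression in piecewise_pairs(["x - 1", "3 - x"], [1, 2, 3]):
    print(f"{expression:>6}   {condition}")
```

Because each condition is only an upper bound, the order of the pairs matters: the first condition that holds selects the expression, which is how Maple's piecewise command evaluates its arguments.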

2.7 PlotDist

Syntax: The command

PlotDist(RandomVariable, LowerLimit, UpperLimit);

plots the current list-of-lists defined distribution between LowerLimit and UpperLimit on a coordinate axis.

Purpose: To give a graphical representation of any list-of-lists represented distribution. The arguments LowerLimit and UpperLimit define the minimum and maximum values desired on the horizontal axis.

Special Issues: A distribution function must be fully-specified for a plot to be generated. The procedure is especially useful for plotting distributions that have


Examples:

• The following statements will generate the plot of the PDF for the triangular(1, 2, 3) distribution:

> X := TriangularRV(1, 2, 3);
> PlotDist(X, 1, 3);

• To plot the HF of the exponential(1) distribution for 0 < t < 10, enter the statements:

> X := ExponentialRV(1);
> PlotDist(HF(X), 0, 10);

• To see a progression of the five PDFs of the order statistics (the procedure is introduced in Section 2.10) for an exponential(1) distribution, one could enter the following statements:

> X := ExponentialRV(1);
> n := 5;
> for i from 1 to n do PlotDist(OrderStat(X, n, i), 0, 10); od;

The result is five PDFs plotted sequentially. This sequence could be of use to an instructor explaining the progressive nature of order statistics to first-year probability students.

• Unspecified distributions produce "empty plot" warnings:

> X := ExponentialRV(lambda);
> PlotDist(X, 0, 10);


Algorithm: The algorithm is a nested set of plot commands that combine to form a single plot. This is standard Maple programming for plotting multiple functions on a single set of axes. Since -∞ and ∞ are common endpoints of random variables, it is necessary to specify the lower and upper endpoints of the horizontal axis.

2.8 ExpectationRV

Syntax: The command

ExpectationRV(RandomVariable, Function);

returns the expected value of a function of a random variable.

Purpose: To find the expected value of a function of a random variable.

Special Issues: Procedures MeanRV and VarianceRV are special cases of the ExpectationRV procedure, as is evident from their names.

Examples:

• In order to find the expected value of a standard normal random variable, type:

> X := NormalRV(0, 1);
> meanX := ExpectationRV(X, x -> x);

• Unspecified distributions may also be used. Here the mean of the exponential(λ) random variable is calculated with the statements:

> X := ExponentialRV(lambda);


Algorithm: The algorithm is a straightforward implementation of the following result. Let the continuous random variable X have PDF $f_X(x)$. Let $g(X)$ be a continuous function of X. The expected value of $g(X)$, when it exists, is given by

$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx.$$

The algorithm is in Appendix D.
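For a PDF stored as a list-of-lists, the integral above splits into one integral per segment. The Python sketch below illustrates this under stated assumptions: it is not the Appendix D algorithm, it replaces Maple's symbolic integration with composite Simpson integration, it handles only finite supports, and the names simpson and expectation are hypothetical.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def expectation(rv, g):
    """E[g(X)] as the sum over segments of the integral of g(x) * f(x)."""
    functions, breakpoints, _labels = rv
    return sum(
        # f=f binds the current segment function inside the lambda.
        simpson(lambda x, f=f: g(x) * f(x), breakpoints[i], breakpoints[i + 1])
        for i, f in enumerate(functions)
    )


# PDF of a triangular(0, 1, 2) random variable.
X = [[lambda x: x, lambda x: 2 - x], [0, 1, 2], ["Continuous", "PDF"]]

mean = expectation(X, lambda x: x)
variance = expectation(X, lambda x: (x - mean) ** 2)
print(mean, variance)
```

The mean and variance are obtained from the same routine by changing only g, which is the sense in which a mean or variance procedure can be viewed as a special case of a general expectation procedure.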

2.9 Transform

Syntax: The command

Transform(RandomVariable, Transformation);

returns the PDF of the transformed random variable in the list-of-lists format.

Purpose: To determine the PDF of the transformation of a random variable of the form Y = g(X). As is the case for the random variable X, the transformation function g(X) may be defined in a piecewise fashion (see Chapter 3).

Special Issues: The transformation function must also be defined in an altered list-of-lists format (two sub-lists holding the segment functions and their endpoints, as in the examples below). For this function, the modeler must break the transformation into piecewise monotone segments. Details on why this must be the case, in addition to other implementation issues, are given in Chapter 3.

Examples:

• Let X ~ U(0, 1) and Y = g(X) = 4X. The following statements will generate the PDF of Y:


> X := [[x -> 1], [0, 1], ['Continuous', 'PDF']];
> g := [[x -> 4 * x], [-infinity, infinity]];
> Y := Transform(X, g);

• The following statements determine the distribution of the square of an inverse Gaussian random variable with λ = 1 and μ = 2:

> X := InverseGaussianRV(1, 2);
> g := [[x -> x^2], [0, infinity]];
> Y := Transform(X, g);

• An example of finding the negative of a random variable is included in Section 2.12 on the command SumRV, used in finding differences of random variables.

• An example of finding the reciprocal of a random variable is included in Section 2.11 on the command ProductRV, used in finding ratios of random variables.

• An example of dividing a random variable by a constant is included in Section 2.15 on the command MLE, used to find the distributions of certain estimators.

• A number of other illustrative examples are given in Chapter 3.
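For a single monotone segment, the standard change-of-variable result $f_Y(y) = f_X(g^{-1}(y))\,\left|\frac{d}{dy} g^{-1}(y)\right|$ can be sketched in Python. This is a simplified illustration and may be less general than the theorem of Chapter 3: it covers one monotone piece only, the caller must supply the inverse and its derivative, and the name transform_monotone is hypothetical.

```python
def transform_monotone(pdf, support, g, g_inv, g_inv_prime):
    """PDF and support of Y = g(X) when g is monotone on the support of X."""
    a, b = support
    # A decreasing g swaps the end points, so sort them.
    lo, hi = sorted((g(a), g(b)))

    def pdf_y(y):
        if not lo <= y <= hi:
            return 0.0
        # Change of variable: density of X at the preimage times |dx/dy|.
        return pdf(g_inv(y)) * abs(g_inv_prime(y))

    return pdf_y, (lo, hi)


# X ~ U(0, 1) and Y = 4X, as in the first example above.
f_Y, support_Y = transform_monotone(
    pdf=lambda x: 1.0,
    support=(0.0, 1.0),
    g=lambda x: 4 * x,
    g_inv=lambda y: y / 4,
    g_inv_prime=lambda y: 0.25,
)

print(f_Y(2.0), support_Y)
```

Under these assumptions Y = 4X comes out uniform on (0, 4) with density 1/4. A transformation that is not monotone would have to be handled one monotone segment at a time, which is consistent with the requirement stated under Special Issues.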

Algorithm: The theorem which provides the basis for the algorithm and the details associated with the algorithm are found in Chapter 3. The algorithm is in Appendix


2.10 OrderStat

Syntax: The command

OrderStat(RandomVariable, n, r);

returns the PDF of the rth of n order statistics drawn from a population having the same distribution as RandomVariable.

Purpose: This procedure is designed to return the marginal distribution of specified order statistics. The procedure's arguments are defined as follows: the population distribution is represented by the list-of-lists format, the integer sample size n, and the integer r to denote the rth order statistic. The procedure returns the marginal PDF for the rth order statistic in the list-of-lists format. The procedure is a direct implementation of the widely-published theorem on the distribution of the order statistics (e.g., Larsen and Marx, 1986, p. 145).

Special Issues: This procedure is robust for unspecified parameters in the population distribution. It is also fairly robust at returning the appropriate PDF when either n or r is unspecified. It is also robust when dealing with more than one segment in a PDF. This procedure was a cornerstone procedure that allowed the goodness-of-fit contributions discussed in Chapter 6 of this dissertation.
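The textbook result for the rth order statistic of n iid continuous random variables with PDF $f$ and CDF $F$ is $f_{(r)}(x) = \frac{n!}{(r-1)!\,(n-r)!}\,F(x)^{r-1}\,[1 - F(x)]^{n-r}\,f(x)$. The Python sketch below evaluates this formula pointwise. It is an illustration only: APPL returns a symbolic list-of-lists, whereas this sketch takes the PDF and CDF as Python functions, and the name order_stat_pdf is hypothetical.

```python
import math


def order_stat_pdf(pdf, cdf, n, r):
    """PDF of the r-th order statistic from n iid draws with the given pdf and cdf."""
    coefficient = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))

    def pdf_r(x):
        F = cdf(x)
        return coefficient * F ** (r - 1) * (1 - F) ** (n - r) * pdf(x)

    return pdf_r


# Third order statistic from a sample of five U(0, 1) random variables.
f3 = order_stat_pdf(pdf=lambda x: 1.0, cdf=lambda x: x, n=5, r=3)
print(f3(0.5))
```

For the uniform case the result is a beta(3, 3) density, which gives a simple check on the coefficient.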

Examples:

• The PDF of the third order statistic from a sample of five items distributed
