An Automated procedure for simulating complex arrival processes: A Web-based approach

(1)

Rochester Institute of Technology

RIT Scholar Works

Theses

Thesis/Dissertation Collections

11-1-2003

An Automated procedure for simulating complex

arrival processes: A Web-based approach

Sachin Sumant

Follow this and additional works at:

http://scholarworks.rit.edu/theses

This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please [email protected].

Recommended Citation

(2)

Rochester Instituteof

Technology

AN AUTOMATEDPROCEDUREFORSIMULATING

COMPLEX ARRIVAL PROCESSES:

A WEB-BASEDAPPROACH

A Thesis

Submitted inpartialfulfillmentofthe

requirementsforthedegreeof

MasterofSciencein Industrial

Engineering

inthe

DepartmentofIndustrial & Systems

Engineering

Kate Gleason Collegeof

Engineering

by

SachinG.Sumant

B.E.,

Production

Engineering,

V. J. T.

I., University

of

Bombay,

2000

(3)

DEPARTMENT OF INDUSTRIAL AND SYSTEMS ENGINEERING

KATE GLEASON COLLEGE OF ENGINEERING

ROCHESTER INSTITUTE OF TECHNOLOGY

ROCHESTER, NEW YORK

CERTIFICATE OF APPROVAL

M.S. DEGREE THESIS

The M.S. Degree Thesis of Sachin G. Sumant

has been examined and approved by the

thesis committee as satisfactory for the

thesis requirement for the

Master of Science degree

Approved by:

Dr. Michael E. Kuhl, Thesis Advisor

(4)

Thesis Reproduction Permission Statement

Permission granted

Title of thesis

AN AUTOMATED PROCEDURE FOR SIMULATING COMPLEX ARRIVAL

PROCESSES: A WEB-BASED APPROACH

I, Sachin G. Sumant, hereby grant permission to the RIT Library of the Rochester

Institute of Technology to reproduce my thesis in whole or in part. Any

reproduction will not be for commercial use or profit.

(5)

ACKNOWLEDGEMENTS

I am short of words to express _my gratitude towards _{Dr. Kuhl. Not only has he been} a

phenomenal academic _advisor, but also a _mentor, _assisting me on several occasions not

directly

relatedto_my academic career.

I would like to thank Dr. _{Sudit for serving} as a member on _my thesis committee. I am

grateful to Dr. Wilson from North Carolina State

University

_{for providing} guidance and

valuable inputs related to this research. I wish to thank Mr. Madhu Nair for his assistance

during

mythesis work.

Specialthanks toDr. Mozrall for hersupport andencouragement

during

theproject.

I appreciatethesupport of_{my friends} whohave beenexcellent companions andIthankthem

for making my stayatRochester Instituteof

Technology

enjoyable. This workis dedicatedto

myparentsand_my elderbrother for motivating me tocome sofarand for

teaching

meto be

(6)

ABSTRACT

In

industry,

simulationisone ofthemost _widelyused probabilistic _modelingtools for

modeling

highly

complex systems.Majorsources of_{complexity include} the inputsthatdrive

the logic ofthe model. Effective simulation _{input modeling} requires the use ofaccurate and

efficient input _modeling procedures. This research focuses on _{nonstationary} arrival

processes. The fundamental stochastic model on which this _{study is} conducted is the

nonhomogeneous Poisson process

(NHPP)

which _{has successfully been} used tocharacterize

arrival processes where the arrival rate changes over time. Although a number of methods

existfor modelingthe rate and mean valuefunctions thatdefinethe behaviorof

NHPPs,

one

of the most flexible is a multiresolution procedure that is used to model the mean value

function forprocesses _{possessing long-term} trends over time or asymmetric, multiple cyclic

behavior. In this research, a statistical-estimation procedure for _automating the

multiresolution procedure is developed that involves the

following

steps at each resolution

level corresponding to a basic cycle:

(a)

transforming

the cumulative relative

frequency

of

arrivals within the cycle to obtain a linear statistical model

having

normal residuals with

homogeneous variance;

(b)

fitting

specially formulated polynomials to the transformed

arrival

data;

(c)

_performing a likelihood ratio test to determine the degree of the fitted

polynomial; and

(d)

fitting

a polynomial of the degree determined in

(c)

to the original

(untransformed)

arrival data.

Next,

an experimental performance evaluation is conducted to

test the effectiveness of the estimation method. A web-based application _{for modeling}

NHPPs using the automated multiresolution procedure and _generating realizations of the

NHPP is developed.

Finally,

a web-based simulationinfrastructure that integrates _modeling,

(7)

TABLE OF

STATEMENT

5

3.LITERATURE REVIEW 7

3.1 Simulation Input

Modeling

7

3.2

Estimating

NHPPs 9

3.2.1 Poisson Processes 10

3.2.2Nonhomogeneous Poisson Processes 12

3.3MultiresolutionProcedure for

Modeling

NHPP 16

3.4 Estimation Procedures 24

3.4.1 Maximum Likelihood Estimation 24

3.4.2 Hypothesis

Testing

26

3.4.3 Likelihood Ratio Tests 27

3.5 Web-Based Simulation 28

4. AN AUTOMATED MULTIRESOLUTION PROCEDURE FOR MODELING

NHPPS 34

4.1 Input

Modeling

andtheNeedfor Automation 34

4.2 Selectionof anAppropriate Function for R.

()

35

4.3

Estimating

Rt()

38

4.4 Likelihood Ratio Test Procedure 39

4.5 An Example

Illustrating

theAutomated MultiresolutionProcedure 48

5. EXPERIMENTAL PERFORMANCE EVALUATION 55

5.1 GenerationofExperimental Data 55

5.2 FormulationofPerformance Measures 56

4.3 Presentation andAnalysisofResults 63

6. WEB-BASEDIMPLEMENTATIONOFMULTIRESOLUTIONPROCEDURE

FOR MODELINGNHPPS 79

6.1 Application Requirements 80

6.2 Software Selection 81

6.2.1 Static Web Pages 81

6.2.2 Dynamic Web Pages 82

6.3 Application DesignandDevelopment 84

6.3.1 Input Data 87

6.3.2 Estimate Data 89

6.3.3 Output Data 90

6.4 Web-Based Multiresolution Procedure (WBMP): An Example 92

6.4.1 Generate Data 92

(8)

7. WEB-BASED

SIMULATION

ENVIRONMENT: A CONCEPT 102

7.1 General Issues 102

7.2 The Web-Based Simulation Environment Concept 104

7.3 Web-Based Simulation Environment: Design 106

7.3.1 Web-BasedInput

Modeling

106

7.3.2Web-BasedOutput Analysis 109

7.3.3 Web-based Model

Building

112

7.3.4Web-basedVerificationandValidation 114

8.

CONCLUSIONS

AND FUTURE WORK 117

REFERENCES 120

Appendix A: VB.Net ProgramtoRun anExecutableon the Server 124

AppendixB: Input Data for Web-Based Data

Fitting

Algorithm 126

Appendix C: Input Data for Data GenerationAlgorithm 127 Appendix D: Set

Up

Procedure for Web-Based Input

Modeling

Environment 128

(9)

1.INTRODUCTION

Simulation is the process of

designing

and _creating amodel ofa real or proposed

system via a mathematical-logical _model, _conducting experiments on the model, and

drawing

inferences from the model about the operation of thereal system (Kelton et al.

2002,

Kuhl 1997). Simulation can be appliedto describe and analyze the behavior of a

system, analyze "What if...?" questions about systems under _study, compare alternative

system _{configurations,} optimize system _performance, and aid in the design of new

systems. Although other operations research tools such as _queuing _theory, linear

programming, etc. can also be used for these types of studies and provide closed form

analytical _solutions, the _simplifying assumptions _necessary to use these tools _may result

in models thatdo not represent the true behaviorof the system. The powerofsimulation

when comparedto otheranalysis toolsis in the _abilitytoanalyze complex systems.

Thus,

in the case of complex systems

involving

_variability and interdependencies between

system _components, simulation provides the capabilities _{necessary for obtaining} valid

solutions.

In _general, simulation studies can be divided into six basic processes, namely,

problem

formulation,

creation of a conceptual_model,creation of a computermodel, input

modeling, verification and _validation, and experimentation. Problem formulation

involves

developing

problem _statement,

determining

the objectives of_study and_planning

the study. After

formulating

the problem, thenext taskis to _study the real worldprocess

and create a theoretical model, which is called the conceptual model. The conceptual

model defines the aspects ofthe system that should be included in the _study as well as

(10)

established, a computer simulation model is constructed. The simulation model, like all other _modeling _techniques, requires inputs. For simulation, the input _modeling

procedures encompass all ofthe activities needed to _specifytheparameters thatdrive the

simulation model

including

data_{collection, analysis,} andrandom variate generation. The

verification and validation process consists of_checking the operation of the model and

determining

ifthe model _accurately represents the real system.

Finally,

experimentation

includes

designing

and_performing a set ofexperiments, analyzing the simulation outputs

and

drawing

theinferences aboutthe system.

Of the six processes mentioned above, one ofthe most important and influential

processes in any simulation _{study is input}modeling. Input modelingperforms arole of

an interpreterofthereal worldsystemforthe simulation model. When available, dataare

collected from the real system, and probabilistic input models are fit to the data

by

applyingvarious statisticalalgorithms.

Then,

themodels areused inthe simulation model

tocreate randomprocesses_representingthe system's arrivalprocesses, service_times, etc.

Since input modelingprovidesinputs to thesimulation _model, theoverallperformanceof

the _{study depends}

heavily

on the _accuracy ofthe inputmodels. Although there is alarge

body

of research in the area of_{input modeling for stationary} _processes, thederivation of methods _{for estimating} and _assessing the goodness of fit of models applicable to

nonstationaryprocesses_{is ongoing}

(Banks,

1998).

Nonstationary

arrival processes are _routinely encountered in _practice, and thus

being

able to generate realizations of _nonstaionary processes in simulation models is

essential. The arrival of customers to a

bank,

demand forseasonal products such as lawn

(11)

and the arrival oftelephone calls at call centers, are sometypical examples of situations

wherethearrival rate changes overtime. One inputmodelthathas beenused_successfully

in thepast tomodel a large class _{nonstationary}arrival processes is the nonhomogeneous

Poisson process (NHPP). The

key

to _{modeling NHPP's is} tocharacterize the arrival rate

as the rate changes over time. Both parametric and non-parametric methods have been

developed to model the arrival rate (These methods arediscussed in detail in Chapter 3).

In many simulation _studies, _{nonstationary} arrival processes exhibit long-term trends or

multiple periodic behaviors. For example, in analyzing the arrival streams of

liver-transplant donors and patients for the UNOS Liver Allocation

Model,

Pritsker et al.

(1995)

observed some arrivalratesto reveal significant growth overtime as well as

daily,

weekly and annual effects. To model such type of NHPPs with

long

term trend and

periodic _effects, Kuhl and Wilson

(2000)

derive a flexible theoretical procedure called

themultiresolution procedure.

Kuhl and Wilson

(2001)

present the

theory

behind the multiresolution procedure

andshow examples of how theirmethod works.

However,

these examples require some

simplifying assumptions and require the user to determine the _adequacy of the fitted

model. Inthis thesis, statistical procedures are developedthatcanbe usedto automate the

model

fitting

process for the multiresolution procedure without the need for _simplifying

assumptions or intervention

by

the user. The statisticalprocedures are implemented in a

computer _program, and an experimental performance evaluation of the procedure is

conducted.

The dominance oftheInternet in the development ofinformation and

technology

(12)

to disseminate the automated NHPP

fitting

_procedure, a web-based input modeling

interface is developed. Basedon a thorough literature _{review, this} is the first web-based

input modeling software of its kind.

Thus,

some unique challenges were encountered in

the developmentof web-basedsystem. Included in theweb-based input modelingsystem

is the _{ability for}the userto submit a _{file containing} collected data (arrival _times) and to

specify parameters of the

fitting

procedure. Graphs of the fitted NHPP and each of the

componentsofthemutliresolutionprocedure are plotted versustheobserveddata. Output

files ofthe parametersare alsogenerated and can be downloaded

by

theuser. In addition,

the web-based system canbeused togenerate realizationofthefitted NHPP thatcouldbe

usedin simulation experiments.

Research intothe area ofweb-based input modeling, has led to the investigation

of a web-based simulation infrastructure that would allow users to conduct complete

simulation studies on the web. The available functions would include the _ability to

(a)

upload a simulation_model, or at some pointin the

future,

tobuild a simulation modelon

the web;

(b)

perform verificationandvalidation over_{the web;}

(c)

conduct_{input modeling}

and _analysis; and

(d)

conductoutput analysis. The web-basedinput modeling module is

demonstrated through the implementation of the NHPP

fitting

procedure. Although full

development of each of the fourmain components is beyond the scope ofthis _thesis, the

infrastructure for each of the modules will be discussed and demonstrated with simple

(13)

2.PROBLEM STATEMENT

The focus of this research is on _modeling and simulation of complex arrival

process. In particular, the research examines the multiresolution procedurefor estimating

the mean value function of nonhomogeneous Poisson processes

having

long

term trends

or nested cyclic behavior over timewith the goal of

being

ableto generate realizations of

theNHPPs in simulation experiments. This work extendsthe _capabilityand

functionality

ofthe theoretical basis forthe multiresolution procedure established

by

Kuhl andWilson

(2001)

by

providing an automated statistical analysis _{methodology for modeling NHPP}

that is implementedin aweb-basedinput modelingand generation software.Theresearch

workhas fourmainobjectives,and each objective has itsown uniquechallenges.

Thefirst objectiveofthis thesisis to automatethe multiresolution procedure. The

multiresolution procedure estimates mean value function of a NHPP. The mean value

function is calculatedfromthefunctions fittedto

long

term trends and all periodiceffects

(i.e. to all resolutions).

Therefore,

to automate the multiresolution procedure, the first

task is to determine a

function,

which should not _{only fulfill} multiresolution procedure

requirements but also demonstrate ability to model complex trends and cyclic effects in

data.

Next,

a procedure has to be determined to estimate the functions at all the

resolutions. After this research, a computerprogram needs to be developed and tested.

Finally,

a complex example has to be selected to validate the results of the automated

multiresolution procedure.

The second objective of this research _{involves conducting} _rigorous _experiments

(14)

circumstances. This experimental performance evaluation includes design of a set of

experiments,determinationof performance measures and

finally

analysisoftheresults.

The third objective concentrates on design and development of web-based

multiresolution procedure. This applied research encompasses selection of appropriate

software and

hardware,

designanddevelopment ofthe web basedinfrastructure for input modeling, and

finally

implementation and

testing

of the infrastructure with the

multiresolutionprocedure algorithm.

The final objective consists of

developing

a conceptual web-based simulation

environment.

According

to Kuljis andPaul

(2000),

web-based simulation researchers are

exploring all new directions but still not able to achieve the full advantages of the

internet. This thesis work studies the latest web-technologies and their _{capabilities,}

analyzes approaches taken in web-based simulation

literature,

and

finally

designs a

conceptual web-based simulation environment, which can utilize the benefits of the

Internet benefits to conduct simulation studies. In addition, the simulation environment

(15)

3.LITERATURE REVIEW

The focus of this research is to automate the multiresolution procedure for

modelingnonhomogeneousPoissonprocesses and toillustrate web-basedinput modeling

by developing

a web-basedimplementation ofthe multiresolution procedure.Toestablish

a basis for the automation of the multiresolution procedure, simulation input modeling

methods

induing

nonhomogeneousPoissonprocesses andtheir characteristics, parameter

estimation_procedures, andlikelihoodratio testsarediscussed. A detailed summaryofthe

multiresolution procedure and its characteristics is also provided.

Finally,

the current

researchinthefieldof web-basedsimulationisreviewed.

3.1 Simulation Input

Modeling

When asimulation model is created, there are_{many details} thatmustbe specified

in order to define a valid simulation model. These details include simulation model

inputs. Theproceduresto determinetheseinputsarereferred as_{input modeling}methods.

Figure 3.1 illustrates the _{input modeling} process

(Banks,

1998). In general, this

input modeling process is the method

by

which the process andevents that occur in the

real system are translated into mathematical models that will be used in the simulation

program.

Thus,

each simulation model inputcorresponds to aprocess oreventin the

real-world system. The input modeling process requires combination of

data,

relevant

applicable

theory

(theory

of stochastic processes or related to the system under

consideration) and prior experience. The data required for input modeling are collected

from the real world.

Sampling

methods are used if observation data are available. The

application of

theory

on _{data is completely influenced}

by

the characteristics of the

(16)

experience of the practitioner or expert opinion can be valuable in ascertaining proper

understanding of current _process, appropriate application of

theory

and valid results.

Various input modeling strategies are developedandused tofitthe input model

(Nelson,

1995). To beuseful, theinputmodel thatis developedmustbe bothreasonableand valid.

That

is,

themodel must _{be statistically}sound and at the same time be able to _accurately

represent the system under study.

Thus,

an assessment of the reasonableness of the

resulting model anda_validityassessmentofthe_resultingmodelmustbe made. Thefitted

inputmodel is then passedto thesimulation program wherethe random variate generator

employs theinputmodel tocreate realizations oftheprocess.

Validity

Assessment

?

Real World Process

Reasonableness

Assessment

Sampling

Data

Theory

Past Experience

Modeling Strategy

[image:16.520.46.476.321.573.2]

Input Model

(17)

In practice, there are _many different types of processes and _{data resulting in}

various _modeling strategies.

Here,

the thesis focuses on processes from which

observational datacan beobtained to which stochastic process models and

theory

can be

applied.

3.2

Estimating

NHPPs

Any

process that has a random element is considered a stochastic process.

Stochastic processmodels are waysof_{statistically analyzing}the dynamicrelationshipsof

sequences of random events (Cox and

Isham, 1980,

Taylor and

Karlin,

1994). Random

processes occur _very often in almost _{every field} of natural and _engineering sciences.

Stochastic modeling can be applied to analyze the _{variability inherent in} _{manufacturing}

systems, biological and medical _{processes, to} deal with uncertainties _{affecting decision}

making, and with the complexities ofpsychological and social interactions. In addition,

stochastic _modeling can provide viewpoints, methods, models and insightto aid in other

mathematical and statistical studies.

Stochastic processes can be classified as

being

_stationary or nonstationary. A

process is said to _{be stationary if} the process mean and the process _variability remain

constantovertime. _{A nonstationary}process,

however,

changes overtime.

In the case of _stationary stochastic processes, a random sample of independent

and

identically

distributedobservationscanbe collectedand a_{probability distribution}can

be fit to the data. A

three-step

procedure usedto fit theoretical distribution to the data is

asfollows:

1. Hypothesize a candidate distribution:

first,

ascertain whether the current process is

(18)

2. Estimate the parameters of the hypothesized distribution: For example, if the

hypothesis is that the _{underlying data} are _exponential, the parameter to be estimated

fromthe data istheexponential rate.

3. Perform goodness-of fit test such as the chi-squared test: If the test rejects the

hypothesis that is a _{strong indication} that the hypothesis is not true. In that case, go

back to_step

1,

or_applytheempirical distribution ofthedata.

In the case of _{nonstationary} stochastic _process, complete observations of the

processes over the time interval of interest are needed to fit a _{nonstationary} stochastic

model. One _model, which has successfully been used to represent a large class of

processes, is the nonhomogeneous Poisson process (NHPP). The NHPP is a

generalization of the _stationary process known as the

(homogeneous)

Poisson process.

Therefore,

to aidin

discussion,

the properties ofthe

(homogeneous)

Poisson process will

bepresentedin thenext section priorto

discussing

NHPPs.

3.2.1 Poisson Processes

A Poisson process is a particular type of stochastic process referred to as a

countingprocess, pointprocess, orbirthprocess.

Namely,

aPoissonprocess

Nt

measures

the number of _{randomly occurring} events that have occurred in the time interval

(0,0-The Poisson process is characterized

by

the average rate of occurrence of _events, X. In

addition, aPoissonprocess hasthe

following

properties:

1. Thenumberof events atthe start oftheprocess (time

0)

iszero.

2. The process has independent increments. That

is,

the number of customers _arriving

during

one time interval does not affect the number _arriving

during

a different time

(19)

(non-3. The average number of events that occurin an interval oftime is proportional tothe

lengthoftheinterval.

4. Customersarrive one at atime.

Formally,

theseproperties havethe

following

mathematicalformulation

1.

Nt

= 0.

2. _{For any}time_t, s

>0,

thedistributionof

Nt+S

-Nt

is independentoft.

3. Theaverage rate meansE[Nt]=

At

_{(A is}_the _{average rate).}

4. Foralmost allrealizations, forsmall _s, andA.as therate

P{N(t,t+_s)=

0}

=_l-As₊ o(s);

P{N(t,t+ _s)=

1}

=As₊

o(s); and

P{N(t,t+_s)>l} = o(s)

for any t,

Nt+S

-Nt

is independentof

{

Nu

_; u <t}.

A stochastic process

Nt

_satisfying the above assumptions is called a Poisson

process with rate parameter A

(

Cinlar, 1975, Bhat,

1972).

Furthermore,

the number of

events thatoccurin aninterval oftime has aPoisson distribution with a_probabilitymass

functionofthe

form,

. .

(AtVe-M

p(x) =

; , X

- 0,1,2,. ..

(20)

In _{addition, the} distribution of the time between events has an exponential distribution

with a mean of\IAandhasthe

following

density

function,

Ae'M,t>0.

The Poisson process _{is commonly} used in simulation to model arrival _process, such as

arrival of parts to a machine in a _{manufacturing} _system, the arrival of customers to an

ATM,

andthenumberofdefects in aboltofmaterial.

3.2.2 Nonhomogeneous Poisson Processes

In many practical _situations, stochastic processes such as arrival processes are

nonstationary. That

is,

the arrival rate changes over time. Examples of this _{may include}

arrivalstoabankoverthecoarseofa

day,

demand for lawn mower over_{the year,} andthe

demand foraproduct overits life cycle. Amodel

frequently

used tomodel these types of

nonstationaryprocesses is thenonhomogeneousPoissonprocess.

AnNHPP

{N(t),t>Q}

is a _countingprocess where

N(t)

is thenumber of arrivalsin

the time interval

(0,

_t] wheretheinstantaneous arrival rate attime t,

X(t),

is anonnegative

integrable functionoftimeandthe_{corresponding}

(cumulative)

mean valuefunction is

(21)

The rate or mean value function of the _{NHPP completely} characterizes the

probabilistic behavior ofthe process

(Cinlar,

1972,

Cox and

Isham,

1980).

Furthermore,

the

following

propertiesholdforan NHPP:

N(0)

= 0 ;

P{N(t,

t +_s)=

0}

=₁

-A(t)s

+_{o(s) ;}

P{N(t,t+_s)=

l}

=A(t)s₊

o(s); and

P{N(t,t+_s)>l}=_o(s).

Therefore,

by

accurately modelingthe rateormean value

function,

the NHPPcan

be completely defined and used to generate realizations of the process in the simulation

model. Both parametric andnonparametric methods have been developedto estimatethe

rate or mean value function from observed data. Leemis

(1991)

developed a simple

nonparametric method to _specify a observed piecewise-linear estimate where the

breakpoints are determined

by

the observed arrival times in a superposition of several

replicated observations ofthe process. Lewis and Shedler

(1976b)

derive techniquesfor

estimating the rate function to model the transactions in a database system. One

nonparametric estimatorthat

they

derive is

X(t:n,t0)

=

(22)

where_{t0 is}theofthe time

interval,

nis thenumber of observations in

(0,

t0],

T0,

T\,

...,Tn

arethe observations,

b(n)

is abandwidthfunction that tendsto zero as n > andW is a

bounded,

nonnegative,integrable weightfunction satisfying

f

W(u)du = J-a

In parametric _models, several authors have proposed procedures where the

estimatedrate function is translated as piecewise-linear or piecewise-polynomial forms.

Law and Kelton

(2000)

discuss the piecewise constant method of _modeling an NHPP.

This procedure divides the time axis into nonoverlapping time interval where the rate is

considered to be constant and estimates a single rate foreach interval. Kao and

Chang

(1988)

simulate the times ofcalls foranalysis of electrocardiograms over several days at

a hospital using piecewise-polynomial

intensity

function. Lewis and Shedler

(1976a,

1979)

model NHPP with

log

linearrate function anddegree two exponential polynomial

ratefunction.

Asecond approachis to assumethattherate function hassome specific functional

formthathas a sufficient number of parameters to allow itto fitobserveddatawell.

Lee,

Wilson and Crawford

(1991)

illustrate a generalized model, which uses an

exponential-polynomial-trigonometricfunction

(EPTF)

in theratefunctionas

A(t)

= exp

^r,r'

+ysin(ax+

0)

(23)

, which

they

apply tosimulate offshore weather events in the Arctic Sea

involving

both a

cyclic component and a trend.

Johnson,

Lee and Wilson

(1994)

evaluate the EPTF-type

rate function and conclude that the EPTF can _adequately model an NHPP with a rate

function

having

a _{slowly varying long-term} trend or a cyclic rate component.

Further,

they

demonstrate that the parameters of EPTF-type rate function can be effectively

estimated

by

_{using likelihood} ratio test to determine the degree of polynomial rate

component and

by

_solving the likelihood equations to obtain the maximum-likelihood

estimates ofthe continuous parameters ofthe ratefunction.

Kuhl,

Wilson and Johnson

(1997)

further generalize this model to include multiple cyclic effects.

They

propose a

parametricmodel, anEPTMP-typeratefunctionoftheform

A(t)

-exp

J^af

+rtsin(fly

+

0t)

t=i

to model Poisson process

having

trends or multiple periodicities. This model was

implemented in a large-scale simulation model

(ULAM)

of the organ procurement and

transplantation network

(Pritsker,

1998). Kuhl and Wilson

(2000)

formulated and

evaluatedan _{ordinary least} squares

(OLS)

procedurefor estimating the parametric

mean-value function of a NHPP and demonstrated that the estimation _accuracy and

computational _efficiency ofthe OLS procedure in this type of application is reasonable.

During

the evaluationprocess,

they

use an approximate likelihoodratio test to determine

degree of polynomial. Kuhl and Wilson

(2001)

derive a nonparametric technique titled

multiresolution procedure for _estimating the

(cumulative)

mean-value function of a

(24)

may not have familiar trigonometric characteristics such as _symmetry over the

correspondingcycles.

The next section focuses on a nonparametric method developed

by

Kuhl and

Wilson

(2001)

referred to as a multiresolution method _{for estimating} the mean value

function of an NHPP from observed data where the process _may exhibit behavior

including

a

long

term trend over time as well as multiple

(nested)

cyclic effects. Since

this proposal entails _automating thisprocedure (Chapter

4)

and

implementing

it in a

web-basedenvironment(Chapter

6),

thismodelis discussed in detail.

3.3 MultiresolutionProcedurefor

Modeling

NHPP

The multiresolution procedure (Kuhl and

Wilson,

2001)

is used to estimatethe

mean valuefunction ofanNHPP

having long

term trendormultiple periodic components

orboth. The mean value function is fit dataobserved overthe time interval

(0, 5],

where

the _{data may} contain a

long

term trend or _p cyclic effects

having

periods of

length,

bi(i=\,2,...,p),

respectively. The observed arrival process must _statisfy the

following

two

assumptions:

Assumption 1. There are_{p distinct} cyclelengths

(periods) >,

>

b2

> >

bp

suchthat

bt

is

an integral multiple _of

bi+]

for i-l,2,...,p-l.

Moreover,

the time horizon

(0,S]

is taken

suchthatS isandintegralmultiple ofbj;andlet

bo

- S.

Assumption 2. Within thej"1 cycle

[(j

- 1)&,. ,;',.

)

of length

bi (

i =

1,2,

..,p) , the arrival

rate at time tisproportional toa singlebaseline function ofAjs), where s = _t

-(j

-1)

hi

is the offsetfrom the

beginning

_ofthe cycle so that

Mf)=aijAi(t-(j-\)bi)=aUjAi(s)i

and

{

alt :I -

1,2,...,

S

(25)

Toestimate themean value functionpif), thefunction

/?,

(s)

at each resolution for

sg

[0,6,)

and i=

0,1,2,...,

p needs to be estimated first. Note that at each _resolution,

,(0)

=

Rt(0)

= ₀ _and

R.(bd

=

/?,(,)

= 1 for i=

0,1,2,...,

p . At resolution

0,

a

monotonically

increasing

function R0(t),t

[0,5],

is fitted to the points,

{

[jb^NUb^NiS)]1

:j =

0X...,S/bl

}

such that

/?0(0)

=

and

R0(S)

=l.

Forresolution i for i =

1,2,...,

p

-1, amonotonically

increasing

function

/?,

(5)

is

constructed to estimate the cumulative proportion of arrivals that are expected to occur

during

the first 5 time units of the _cycle, where se

[0,>()

. That

is,

for resolution

I,

Ri

(s)

is fittedtothepoints <

[jbM

,

G, (jbM

)]T

:

j

=

0,1,...,&,

/

bi+l

>, where

(.sb.yi

1 ^o,)-i

1

N(S) tt

forall se[0,bt).

At resolution p, let

rk

: k =1,2,... A/

(5)

denote _the _{observed arrival} _times _and

define

rjk

=rk

modbp

for k

-1,2,...,

N(S)

. Let

rj(k)

: fc =

1,2,...,

N(S)

_denote _the

ordered arrival times on theinterval [0,fe

)

. A monotonically

increasing

function is fitted

tothepoints

j

[77(t),*/iV(5)]T:fc =l,2,...,Ar'(5)|.

(26)

fi(t)

=

N(S)QQ(t)

for re

[0,5]

wherethefunctions {Q.

(t)

:i = p,p

-1,...,1,0}

aredefined

iteratively by

Qp(t)

=

Rp(t-(jpa-l)bi),

andfor i =

p-1,...,1,0

Qiit)

=

Rl((jl+u-V)bM-(jit-\)bl

+

[A(7,+iA>

_-ih,

-l)b,-R,(ji+u -l)bt+1 -(;,, -V)bt]QM(t)

where ,is aunique

integer)

suchthat

(j

-\)b{

<t <

jbt

.

To illustrate the multiresolution _procedure, an example of arrival process whose

instantaneous arrival rate over a time horizon of 28 days contains two cyclic effects

including

weekly and

daily

cyclic effects that

is,

periodic rate componentswith periods

of 1 week and one

day

is considered. The NHPP is constructed so as to _satisfy all the

assumptionsdiscussed intheprevious_part, where _p(5)= 216 _and

Resolution0:

R0(s)= 0.0357 143s se

(0,28]

Resolution 1:

R^s)

= 0.2582035 -0.0164780

s2

(27)

Resolution 2:

R2(s)=

1.832005-0.890900

s2

+ 1.02624s4

se

(0,1]

Dataforrealization of arrival process is generated _usingthe simulation algorithm

described

by

Kuhl and _{Wilson (2000). Application} of the multiresolution procedure

started

by

_estimatingthefunction

Rt(s)

_{corresponding}toeach resolution ifor i-0,l,...,p.

Since the NHPP defined above consists of two periodic _components, the procedure

estimated the function

R{(s)

at:

(a)

resolution

0,

_{corresponding} to the long-term trendor

with a period of 28

days; (b)

resolution

1,

_{corresponding} to the cyclic rate component

with a period of

b\

=7

days;

and

(c)

resolution

2,

_{corresponding} to the cyclic rate

component with a period of

b2

= 1 day. While estimating

Rt(s),

_the _degrees _of

polynomials are specified

by

theuserand assumedto be known.

For estimating mean-value

function,

the points of importance at resolution 0 are

those that correspond to the cumulative fraction of arrivals observed at the end of each

week. At resolution

0,

1- degree polynomial was fitted to thefraction of all arrivals that

were accumulated at the end of each week. The estimated function at resolution

0,

is

shownin Figure 3.1. Thepoints ofimportanceat resolution 1 arethosethatcorrespondto

the cumulative fraction ofarrivals observed at the end of each

day

(that

is,

at the end of

each resolution-2 cycle). In this example, a quadratic function was fitted to the

cumulative fraction of arrivals observed at the end of each

day

of the week. The

estimated function at resolution1 is shown in Figure 3.2. The next _{step in} the procedure

(28)

example, all the arrivals areimportant points. A fourth degree polynomial is fitted to the

observed arrivaltimes. Theestimatedfunctionat resolution 2 is shownin Figure 3.3.

Thefittedpolynomials at each resolution are:

Resolution 0:

Ro(s)= _0.0357143s _se

(0,28]

Resolution 1:

Ri(s)= _0.260803665s-0.0168495 s2

se

(0,7]

Resolution 2:

R2(s)= _{1.872994185-1.32037175}_{s2-0.183941s3+} _0.631318599 s4

se

(0,1]

When all the functions are estimated, next task was to estimate the mean value

function. Theprocess of_estimatingthe overall mean valuefunction started

by

_estimating

themean value functionatresolution

0,

po(t),forte

[0,28]

. Tocompletethis,

A>(f)=

N(S)

G0,o(0

=

N(S) R0(t)

forallte

[0,28]

wascalculated.

Afterthecalculationof resolution0component, nexttask wastoincludethedetails

of the _weekly periodic component to the estimated mean value function. This was done

(29)

Applying

the _{equations, the}

following

_{estimate of mean value} _function

was acquired for

thefirstweek:

Mt)

=

N(S)

<2o,i(0=

N(S) /?o(7) Q,,i(0

=

N(S) R0(7) Ri(t)

_for _all _re _[0,28],

The final step was to include the details of the

daily

periodic component to the

estimated mean value function _calculating_/i2(t), for te [0,28].

Applying

the mean value

function equations, the

following

estimate of mean value function was acquired for the

first day:

/u2(t) =

N(S)

Q0,2(r)

=

N(S) R0(7) Qh2(t)

=

N(S) RoO) Ri(l) Q2,2(t)

=

N(S)RQ(l)Rx(\)R2(t)

forall re [0,28]. Themean valuefunctionwas calculated at resolution 1 forsubsequent

weeks and at resolution 2 for subsequent days. Since the process consists of _only two

periodicities, the final estimate of the mean value function was p,2(i). _{The fitted} mean

valuefunction plotted against the observed cumulativenumber of arrivals overthe entire

(30)

14

Time(Days)

21 28

Figure 3.1 Estimated Linear Function atResolution 0

CD

>

<

"o

O

CO

CD

E

O

0.5

3 4

Time(Days)

(31)

0.2 0.4 0.6

Time(Days)

0.8

Figure 3.3 Estimated Quartic Function atResolution 2

Kuhl and Wilson

(2000)

appliedthe method ofinversion to generate realizations

ofthe fitted function while _{simulating NHPPs}

having

a multiresolution type mean value

function.

Clearly

the _efficiency of this simulation scheme depends on the numerical

methods applied to invert estimates of the functions. A nonparametric estimate of the

mean value function of an NHPP

having

long-term trend or cyclic effects that _may

exhibit nontrignometric characteristics canbe obtained

by

the multiresolutionprocedure.

Themultiresolution procedure allows us to model asymmetric periodic behaviorwith no

(32)

14

Time(Days)

21 28

Figure 3.4EstimatedMean ValueFunction (smooth curve) Superimposedonthe

Observed Cumulative Arrival Function

(Step function)

3.4 Estimation Procedures

In order to automate the multiresolution procedure described in the last section,

this research designs a method, which includes a likelihood ratio test (Chapter 4).

Therefore,

we will _review, in this section, maximum likelihood estimation, hypothesis

testing, andlikelihoodratiotests.

3.4.1 MaximumLikelihood Estimation

One of the most _commonly used estimation method is maximum likelihood

estimation. Maximum likelihood estimation has been described as the originaljackknife

becauseof_{its applicability}toawide _varietyof problems

(Breiman,

1973). Themaximum

(33)

space i^thatwillmaximizethevalue ofthelikelihoodfunction. SupposethatX=

(X\,

...,

Xn

)

are random variables with joint

density

or

frequency

function

f(x:0)

where 0e y/.

Given theoutcomesX=

x,thelikelihoodfunction is

L(0;x)=f(x;0)

(Casella and

Berger, 1990, Ross,

1989). The distinction between these two functions is

the variable that is considered fixed and the one that is varied. When the pdf

/

(x;0)

is

considered, 0 is considered as fixedandxas the _variable; when thelikelihood function

L(0;x)

is considered, xis consideredasfixedand 0 asthevariable.

If the data range does not depend on the

data,

the parameter space \ffis an open

set, and the likelihood function is differentiable with respect to 0-(0\

0P)

over y/,

then themaximumlikelihoodestimate

(0)

satisfiestheequations

d\n(L(0))

=0 for k =

1,2,....,

p .

d0k

These equations are called likelihood equations and

\n(L(0))

is called log-likelihood

function

(Knight,

2000).

There are twoinherent drawbacks associated with thegeneral problem of

finding

the maximum of a

function,

and hence of maximum likelihood estimation. The first

problem is that of _actually

finding

the global maximum and _{verifying that,}

indeed,

a

(34)

arise. The second problemis that of numerical sensitivity. It is sometimes the case that a

slightlydifferent sample will produce a

totally

different

MLE,

_{making its}use suspect.

In some _cases, function

R(0)

of parameters 0 needs to be estimated. In such

situation, one of the most important properties of maximum likelihood estimators what

has come to be known as the invariance _property of MLE is applied. The invariance

property states that if 0 is theMLE of

0,

then _{for any function}

R(0),

theMLE of

R(0)

is

7?((9)(CasellaandBerger,

1990).

Maximum Likelihood estimation provides the theoretical basis on which

statistical tests such as hypothesis tests (and in particular likelihood ratio _tests) can be

developed.

Therefore,

the idea ofhypothesis

testing

is discussed first in the next section

andthen thelikelihoodratiotestis elaborated.

3.4.2 Hypothesis

Testing

A statistical hypothesis is a statement about the parameters of one or more

populations andthe

decision-making

procedure about the hypothesis is called hypothesis

testing. Statistical hypothesis

testing

can be divided into six steps. _{The first step in any}

type ofhypothesis

testing

is to

identify

the parameterofinterest. In the currentproblem

domain the diameter of rod is the parameter ofinterest. The second _{step is} to state the

null hypothesis andto _specify the alternative hypothesis. In the third step, a significance

level is selected. As the significance level

increases,

the precision in the decision

decreases.

Here,

the significance level is the probabilityof _obtaining accurate estimation

.In the fourth step, an appropriate test statistic is determined. In the next step, the test

(35)

hypothesis

testing

comprises whether to accept or reject the null hypothesis and to

representthatintheproblem context.

A widely used method of test construction is the likelihood ratio principle.

Statistical methods obtained

by

_{using likelihood} ratio principle often turn out to have

smallest type II error _among all tests thathave the same typeI error_probability

(Nelson,

1995).Thesetests are called aslikelihoodratiotests(LRT).

3.4.3 Likelihood RatioTests

The likelihood ratio method of hypothesis

testing

is related to the maximum

likelihoodestimators, andthe likelihoodratio tests are as _widelyapplicable as maximum

likelihood estimation. Recall that if

X\,

.

..,Xnis arandomsample from a population with

probability distribution

function/(jc;(9),

thelikelihood function is definedas

L(0;xx,...,xn)

=

L(0;x)

=

f{x\0)

=

f[f(xl;0).

i=i

The likelihoodratio statistic_%is definedas

Sup

L(0;

x)

*(*)= ""

SupL(0;x)

v

where

L(6;x)

is a likelihood function. A likelihood ratio test is

HQ:0Ey/x

versus

H.:0Ey/[

where y/=

(36)

critical ratio value c and_accepting_{if %(x)}> _c _(Casella _and

Berger,

1990, Brieman,

_1973).

Even if the two suprema of

L(0,x),

over the sets _y/\ and _y/, can not _{be analytically}

obtained,

they

can _{usually be}computed numerically.

Therationale _{behind LRT may best be}understoodinthe situation in which

f(x;0)

is

the pdf of adiscrete random variable. In this case the numerator of_{%(x) is} the maximum

probabilityofthe observed_{sample, the}maximum

being

computed overparametersinthe

null hypothesis. The denominator of _{%(x) is} the maximum _probability of the observed

sample over all possible parameters. The ratio ofthese twomaxima is small ifthere are

parameter points in the alternative hypothesis for which the observed sample is much

more

likely

than_{for any}parameter pointin thenull hypothesis. In _{this situation, the}LRT

criterion says

Hq

shouldberejectedand

H\

accepted astrue

(Knight,

2000).

In this thesis work, alikelihood ratio test is designedto determine the degree of a

polynomial. Thistestautomatesthemultiresolution procedure.

3.5Web-Based Simulation

After _reviewing web based simulation research papers from

1996-2001,

it is

observedthat the speedin which the web based simulation is progressing, andthe desire

to create complex simulation models and perform simulation operations on web will not

be possible inimmediate future.

However,

According

to Kuljis and Paul

(2000),

there is

certainly promise for certain type of applications to utilize the advantages of Internet

computing andthe web. Themost

likely

candidates are applications that deal with huge

quantities of data such as meteorological modeling; applications that enable users on

(37)

customer (forexample,

manufacturing

which offers customized_products) or applications

ineducation and

training

which

increasingly

_have _to_cater_to_distance

learning

students.

As far as web based simulation is _concerned, there are _currently two main

approaches that are both based on the Java language. One approach develops a new

simulation language _using

Java,

and the other approach ports an _existing simulation

language,

such as

GPSS,

and creates _{it in Java (Whitman 1998). Advances in Java}

developmenttools andJavaBeans

technology

thatcanfacilitate developmentsofreusable

components in combination with the _ability to distribute and execute models via the

Internet may foster increased activity in the development of

high-level,

domain-specific

simulation toolsaimed attheend users.

The integration ofthe web and Javarepresents a technological advancement that

enables a

fundamentally

new approachto simulation modeling.

Healy

(1998)

attributesto

the role, the Java language plays in both the specification and implementation of the

model. Paul

(2000)

explains how Java is beneficial for web-based simulation. Kilgore

(1998)

lists Java features thathave the potential to

dramatically

change the processused

tobuildcomputer simulation models.

Kuljis and Paul

(2000)

discuss several Java based discrete simulation

environments and conclude that developments are _surprisingly conservativeand lack the

veryentrepreneurial nature of media

they

are

trying

to conquer.Most ofthe workfocuses

on the tools that are

being

used

(Java,

Javabeans etc.) rather than how the new medium

can enhanceour use of simulation as a_modeling vehicle. As it

is,

most applications are

"invented"

and answer the question "

What can I put on the

web"

(38)

simulation has to be available on _{the web,} how can I design to facilitate the needs ofits

users."

The main architectures of web-based simulations have been shown and tested

successfully

during

recent years (Fishwick

1996,

Healy

and Kilgore 1997). Wiedemann

(2001)

compares common web technologies with simulation characteristics as shown in

Table 3.1. The comparison reveals large differences _concerning all criteria. A simple

transformation of actual simulation techniques into the web will result in the same

disadvantages plus some new restrictions and problems caused

by

the web itself. The

main cause forthe differences consiststhe _{variability in} the differenttypes of operations.

A typical Internet surfer _essentially works in a _passive, _assimilating mode. His creative

workis limitedtodecisions about_selecting thenext

interesting

linkor newcombinations

of search criteria. A wrong decision can be promptly corrected

by

_pressing the "Back"

button.

Modeling,

simulation and result analysis require a high level of active and

creative work. A problem of web based simulation systems is the _{high complexity} of

theiruserinterfaces.

Fishwick

(1996)

has divided the web based simulation in three topics:

1)

Education and

Training

2)

Publications

3)

Simulation Programs anddiscusses procedures

for

implementing

web technologies and concepts in simulation modeling. Yu-hui Tao

and

Shin-Ming

Guo

(2001)

designedawebbased

training

systemforsimulationanalysis.

According

to

Pidd, Oses,

Brooks

(1999),

various forms of distributed simulation are

possible over the world wide web,

including

simple multiple replications of the same

model, clientserver architecturesforoneormore _{simultaneously}_runningmodels andthe

(39)

component-based approach for web-based simulation. Alfonseca

(2000, 2001)

has built

andexecuteddistributedweb basedmodels.

Table 3. 1. ComparisonofWeb andSimulation

Technology

Characteristics

Common webtechnologies Traditionalsimulation

technologies Common

Standards

Yes

(HTML, XML, TCP/IP)

Nocommonstandardformodel and

results

Data

handling

By

uniqueURL

Proprietary

Information

Structures

Treestructures andlists

Very

complex

Specialist

Knowledge

No

(only

_clickingand

reading)

Yes (fromalotof scientific _areas)

Navigation

Easy

Difficult

Ease ofUse

Very

good

Very

difficult

Typeof Operations

Readand analyze

information

Synthesizemodels and analyze

results.

Seila and Miller

(1999)

present a design for scenario managementin web based

simulation. The objective of the design is to perform output analysis on the web. The

JSIM environment provides the requirements for scenario management and output

analysis. The web based interface created

by

Guru, Savory,

Williams

(2000)

connected

SEVIAN server, Oracle server andApplication serverto allow users to store and execute

SEVIAN simulation models overtheInternet.

Bodner,

Govindraj (2000)

connected

Arena,

CORBA and JAVA applets to design a web-based simulation environment to support

modelinganddesignofmanufacturingand material

handling

systems.

Current Java basedsimulation languages are Simjava (Howell and McNab

2000),

DEVSJAVA (Sarjoughian and Zeigler

1998),

JSIM (Miller

1998), Javasim,

JavaGPSS

(40)

environment) combines web

technology

with the use ofJava and CORBA (Iazeolla and

D'Ambrogio 1998). Most of the

languages

are Java versions of _existing simulation

languages.

The concept of open source

modeling

(Wiedemann, 2001)

represents an

opportunity to improve the _quality _{of common core simulation}

functions,

_improve _the

potential for _creating reusable _modeling _components

using

XML,

HLA and other

simulation _community standards. It aims at

leveraging

the unique communication and

distribution opportunities created

by

the Internet to open the development of simulation

softwareto aworldwide_communityoftalentedsoftwaredevelopers andresearchers. This

conceptis at_{extremely early}stage ofdevelopment.

In contrast, current object oriented commercial simulation languages are capable

of _{modeling any} type of complex systems. Benson

(1997)

illustrated ProModel as a

powerful yet_easy to use simulation tool_{for modeling} all types of_{manufacturing} systems

ranging from small job shops and _machining cells to large mass _production, flexible

manufacturing systems, and _supply chain systems. Most of the languages are windows

basedsystems withintuitive graphical interfaces and object-oriented_modelingconstructs

that eliminate the need for programming. Swets and Drake

(2001)

have introduced the

new Arena product

family,

which allows the modeler to create

discrete,

continuous as

well as combinedmodels. Arenacombinesthe ease of usefound in high-level simulators

with the

flexibility

of simulation

languages,

and even all the _{way down} to general

purpose procedural languages such as the Microsoft _{VB programming} system orC ifthe

user needs. The hierarchical structure of Arena allows user to shift from lower level

(41)

discussion,

it can be concluded that current commercial _{modeling languages} are more

user

friendly

as compared to web-based simulation environments.

They

allow the

modeler to construct complexities in the model and the level of

flexibility

is high for

animation, output analysis and other simulation_studyrequirements.

InChapter 5 andChapter

6,

a web-based simulation infrastructure is demonstrated

utilizing current web technologies and current web-based simulation languages. This

(42)

4. AN AUTOMATED

MULTIRESOLUTION

_PROCEDURE _FOR_MODELING NHPPS

The objective of this chapter is to

develop

a method for _automating the

multiresolution procedure for _modeling NHPPs. The multiresolution _procedure, as

discussed in Chapter

3,

requires several components to

fully

automate the procedure

including

the selection and estimation of a function to be fit at each resolution. In

summary to automate the _procedure, a special form of a polynomial is selected to fit at

each resolution. The polynomial at each resolution is estimated _using _{ordinary least}

square procedure (Kuhl et al. 2000). A likelihoodratio test is designed to determine the

degree of each polynomial. The likelihood ratio test requires the

data,

which have

normalized residuals.

However,

Kuhl and Wilson

(1997)

show that the data observed

from anonhomogeneousPoisson processdo notpossessnormalized residuals.

Therefore,

to _{apply likelihood} ratio _test, the original observed data from the nonhomogeneous

process is transformed _using an arcsin square root transformation to obtain data with

normalized residuals. This transformed data is utilized _only to determine the degrees of

polynomials. Once the degrees are

determined,

these degrees are used to estimate

polynomials from original data. These polynomials are utilized to calculate the mean

value function to representthe NHPP. The derivation _along with a demonstration ofthe

automatedmultiresolutionprocedure with thelikelihoodratiotest is discussed below.

4.1 Input

Modeling

and theNeed for Automation

In thewords ofFriedrich Engels (1820-1895), "An ounce of actionis worth aton

of

theory."

This statement can be applied effectively to many fields of research. In the

(43)

input modeling,

however,

without a method developed for

implementation;

the model

may not be used in actual practice. Due to the high cost and large amount of time

required for

implementation,

the use of a model _may _{become economically} prohibitive.

Kuhl and Wilson

(2001)

establish the theoretical basis and general _{methodology for}

modeling NHPP's usingthemultiresolution procedure.

However,

fora simulation analyst

to _apply the multiresolution procedure in practice, the analyst must complete the

following

tasks:

1. Obtain an appropriatefunction thatfollowsthe theoretical requirements;

2. Obtainan appropriate methodfor

fitting

thefunctionto dataateach_resolution;

3.

Verify

that themodel meetsthe theoretical assumption ofthe procedureas well asthe

assumption ofan

NHPP;

4. Validatethe model;

5. Writecomputer code forthe

fitting

procedure; and

6.

Apply

themodel to thesystem under study.

Therefore,

this research work

develops,

tests and implements an automated procedure

that will _satisfy the first 5 criteria,

leaving

the user_only with _applying the model to the

systemunder study.

4.2Selectionof an Appropriate Function for

Rt

()

Recall that in the multiresolution procedure, at each resolution level

i,

an

estimator

Ri

(s)

ofthefunction

Rt

(s)

mustbeobtained withthe

following

properties: the

initial value /?,.

(0)

=

Ri

(0)

=

0;

_the final value

Rt

(bi)

=

Ri

(bt )

=

1;

_and _the derivative

R'(s)

>0 for se

[0,

bt]

and i =

0,

1, 2,

(44)

At resolution level

0,

a _{monotonically}

increasing

function

R0(t),

re

[0,5]

is

fitted,

to the points

{

[jbx,

N

\jbx)

I N'(S)]r

:

j

=

0, 1,

...,

Slbx

}.

At resolution level i

for / =

1, 2,

..., p

-1, a monotonically

increasing

function

Ri

(s)

is _constructed _to

estimate thecumulative percentage of arrivals thatare expectedto occur

during

thefirsts

timeunits ofthe associated cycle oflength

bn

where s e

[0,

&,.]

. That

is,

forresolution

level

i,

the function

Rt

(s)

is fittedto the points

{

[

jbM,

GZjb^)]1 _:j=0,

1,

...,

bi lbM

},

where

G,(s)

=-i-

[/V(^+s)-W>.)]

(1) N (5

)

_?=0

forall _se[0, &.].

Atresolution_p, let

{rk

: k =

I,

2,

...,

N(S)

}

denotethe observed arrival times;

and define

r/k

_-rk mod

bp

for k -

1,

..., N(S). Let

r/w

<

<^(i)

denote the ordered

arrival times within the level-/? cycle

[0,

bp

]

. Then a monotonically

increasing

function

Rp(s)

is fitted to the points

{[r/(k),

k/N(S)]T

:k =

l,

...,

N(S)}

subject to the usual

boundary

conditions that

7^/,(0)

=

0and^p(Z?/))

=l. The final _estimate

ju(t)

ofthe mean

valuefunction is computedasfollows:

(45)

wherethefunctions

{2,.(r)

: i =

p, p-l, ...,

1,

o}

aredefined

iteratively

by

QP(t)

=

Rp(t-(jpJ-l)bp),

andfor /=

/?-!,...,1,0, take

Q,(t)

=

Ri((j,+l,,-i)bl+l-(jil--i)bi)

+

lR,(Jl+l,,bM-Ui,<-Vbl)

-^(O",-+i,-1)^+1-O'1-,-1)^)]qi-+i(0,

(2)

wherethe _{interval containing}tis within each resolution isgiven

by

the

index,

y,,isthe uniqueinteger

j

such that

(j-\)bi

<t<

jbt

.

In order to construct the mean value function utilizing the multiresolution

procedure, the functions {R)(s): i =

0,1,2,...,/?}

_at _each _resolution _must _be _selected.

Therefore,

a special form of a polynomial that meets these requirements _namely, a

degreerpolynomial oftheform

Ri(s)

=

sib,,

E/M'/M'

+

fi-I^

r-1 ( r-1 A

k=_\

if r =

1,

(s/bi)'

if r _>1,

(3)

(46)

for se

[O.fcJ

is utilized for i =

0,

1,

2,

...,p. The coefficients

{pk

:k=

\,

...,k-l

}

in

(3)

are constrained to yield #,'(s)>0 forall s e

[0,

fc.]. This polynomial _{automatically}

satisfies the required

boundary

conditions.

Further,

the simplest process that can be

modeledis theone with a constant arrival rate overtheinterval

[0, bt],

which corresponds

to the simplestformofthemodel

(3)

wherein r= _{1 is}_taken _(that

is,

_a_linear _mean value

function).

4.3

Estimating

/?.()

To estimate the function

(1)

by

a fitted function

R,

(s)

of the form

(3)

at each

resolution level

i,

the appropriate degree rofthe polynomial

(3)

must be determinedand

then the associated polynomial coefficients p\,

p.,

...,

/3r-\

are estimated. The standard

approach for solving such a problem is to perform regression analysis _using a forward

selection orbackwardelimination procedure (DraperandSmith 1998).

In a forward selection procedure, polynomials of degree r- 1

and r are fitted to

the sample

data;

then the mean square erroris computed foreach fit and an appropriate

likelihood ratio test is performed to determine if significant improvement in the fit is

achieved

by

increasing

the degree of the fitted polynomial from r - 1 to

r. If a

significantly better fit is obtained withthedegree-rpolynomial, then the

fitting

procedure

is continued

by

incrementing

the value of r and _repeating the likelihood ratio _test;

otherwise, the

fitting

procedureis terminated,

delivering

r- 1

as the degree ofthe fitted

(47)

However,

the assumptions of classical regression analysis require observations of

adependent variable thatare independentand normal with a constant variancesothat the

corresponding residuals are independent and

identically

distributed normal random

variables. In the case of

fitting

arrival data obtained from an NHPP that satisfies

Assumptions 1 and

2,

the observed cumulative relative frequencies within each basic

cycle are neither normal nor

independent;

_moreover, such responses do not possess a

constant variance.

Therefore,

a variance _stabilizingtransformation mustbe appliedto the

data before standard statistical procedures to determine an appropriate degree r for the

function

R(

(s)

are used.

4.4 Likelihood Ratio Test Procedure

At each resolutionlevel

i,

let mt =

bjbi+x

-1 for

0<i<p

-1 and _mt =N(S) for

i =_p; andapolynomial function

Rt

(s)

ofthe form

(3)

is fitted with appropriatedegree r

toaset of points

having

thegeneralform

(Zj,W.)T:j=l,...,m.l (-*)

asdefinedin 1. Forexampleiftheresolutionlevel iis in therange

{l,

...

,p-l}, thenat

the;thpointin

(4)

theabscissa

Zj=jbM

andthe ordinate

Wj=Gi(jbl+l)

for;=

1,

..., m,

aretaken. Since

Wj

is alwaysa proportion, thevariance_stabilizingtransformation

Yj^sm^ljW^l

for ;=L ...,/n,.

(48)

is exploited. See pp. 231-238 of

Box, Hunter,

and Hunter

(1978)

and pp. 291-294 of

DraperandSmith (1998).

Corresponding

tothe dependentvariable Y. defined

by

(5),

theindependentvariable

X . =

Zj/bi

=

jbM/b;

for

j

=

1,

..., mi ;

(6)

is definedandinterms ofthevectorofregression coefficients

c

if r =

u

'"[[C1,C2,...,Cr_1],

if r>2,

(7)

forthe degree-rpolynomial

T \u;C =

^

w u,

r~x k

k=_\

--I

c, 2 7 1 *

v fc=l y

r

ifr=l,

ifr>l,

(8)

The

following

statistical model for yyasafunction of

X;

(49)

is postulated. If

(9)

is valid, then the transformation

(5)

yields _{approximately} normal

residuals with constant variance a2

sothat

{ej-.j

=

l,

...,m,\

~ N(0,a2).

(10)

Noticethatin

(8)

Tr(0;Cr)

= 0 _and _Tr

(1;

C

r) =n/2.

(H)

For fixed r (r =

2, 3,

...), the estimator

Cr

of the coefficient vector

Cr

is the

solution ofthe

following

constrainedleast-squaresregressionproblem:

m.

minC y

ri=i

Y.-T

\X

,C

J n J r

(12)

subjectto

(50)

Observethat

r;(:cr)=f

^u^+rf^-f

C

*=1 V 2 4=1

fr r-1 >

ur~\

(14)

so that

(13)

canbe interpretedas _requiringthezeros ofthedegree-(r

-1)

An Automated procedure for simulating complex arrival processes: A Web-based approach

Rochester Institute of Technology

Recommended Citation

AN AUTOMATEDPROCEDUREFORSIMULATING

Engineering

M.S. DEGREE THESIS

Permission granted

ACKNOWLEDGEMENTS

University

Technology

ABSTRACT

industry,

NHPPs,

following

Finally,

TABLE OF

CONTENTS

1. INTRODUCTION 1

STATEMENT

3.LITERATURE REVIEW 7

Modeling

4. AN AUTOMATED MULTIRESOLUTION PROCEDURE FOR MODELING

Modeling

5. EXPERIMENTAL PERFORMANCE EVALUATION 55

7. WEB-BASED

SIMULATION

REFERENCES 120

Fitting

1.INTRODUCTION

designing

involving

developing

including

heavily

Model,

fitting

fitting

future,

2.PROBLEM STATEMENT

having

Therefore,

finally

literature,

3.LITERATURE REVIEW

by developing

(Banks,

(theory

theory

Validity

theory

being

Therefore,

following

Cinlar, 1975, Bhat,

frequently

interval,

Chang

involving

+rtsin(fly

During

including

having

beginning

during

including

days; (b)

Applying

Time(Days)

having

Therefore,

frequency

(Knight,

7?((9)(CasellaandBerger,

decision-making

testing

testing

Berger,

likely

However,

increasingly