Rochester Institute of Technology
RIT Scholar Works
Theses
Thesis/Dissertation Collections
11-1-2003
An Automated procedure for simulating complex
arrival processes: A Web-based approach
Sachin Sumant
Follow this and additional works at:
http://scholarworks.rit.edu/theses
This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please [email protected].
Recommended Citation
Rochester Instituteof
Technology
AN AUTOMATEDPROCEDUREFORSIMULATING
COMPLEX ARRIVAL PROCESSES:
A WEB-BASEDAPPROACH
A Thesis
Submitted inpartialfulfillmentofthe
requirementsforthedegreeof
MasterofSciencein Industrial
Engineering
inthe
DepartmentofIndustrial & Systems
Engineering
Kate Gleason Collegeof
Engineering
by
SachinG.Sumant
B.E.,
ProductionEngineering,
V. J. T.I., University
ofBombay,
2000DEPARTMENT OF INDUSTRIAL AND SYSTEMS ENGINEERING
KATE GLEASON COLLEGE OF ENGINEERING
ROCHESTER INSTITUTE OF TECHNOLOGY
ROCHESTER, NEW YORK
CERTIFICATE OF APPROVAL
M.S. DEGREE THESIS
The M.S. Degree Thesis of Sachin G. Sumant
has been examined and approved by the
thesis committee as satisfactory for the
thesis requirement for the
Master of Science degree
Approved by:
Dr. Michael E. Kuhl, Thesis Advisor
Thesis Reproduction Permission Statement
Permission granted
Title of thesis
AN AUTOMATED PROCEDURE FOR SIMULATING COMPLEX ARRIVAL
PROCESSES: A WEB-BASED APPROACH
I, Sachin G. Sumant, hereby grant permission to the RIT Library of the Rochester
Institute of Technology to reproduce my thesis in whole or in part. Any
reproduction will not be for commercial use or profit.
ACKNOWLEDGEMENTS
I am short of words to express my gratitude towards Dr. Kuhl. Not only has he been a
phenomenal academic advisor, but also a mentor, assisting me on several occasions not
directly
relatedtomy academic career.I would like to thank Dr. Sudit for serving as a member on my thesis committee. I am
grateful to Dr. Wilson from North Carolina State
University
for providing guidance andvaluable inputs related to this research. I wish to thank Mr. Madhu Nair for his assistance
during
mythesis work.Specialthanks toDr. Mozrall for hersupport andencouragement
during
theproject.I appreciatethesupport ofmy friends whohave beenexcellent companions andIthankthem
for making my stayatRochester Instituteof
Technology
enjoyable. This workis dedicatedtomyparentsandmy elderbrother for motivating me tocome sofarand for
teaching
meto beABSTRACT
In
industry,
simulationisone ofthemost widelyused probabilistic modelingtools formodeling
highly
complex systems.Majorsources ofcomplexity include the inputsthatdrivethe logic ofthe model. Effective simulation input modeling requires the use ofaccurate and
efficient input modeling procedures. This research focuses on nonstationary arrival
processes. The fundamental stochastic model on which this study is conducted is the
nonhomogeneous Poisson process
(NHPP)
which has successfully been used tocharacterizearrival processes where the arrival rate changes over time. Although a number of methods
existfor modelingthe rate and mean valuefunctions thatdefinethe behaviorof
NHPPs,
oneof the most flexible is a multiresolution procedure that is used to model the mean value
function forprocesses possessing long-term trends over time or asymmetric, multiple cyclic
behavior. In this research, a statistical-estimation procedure for automating the
multiresolution procedure is developed that involves the
following
steps at each resolutionlevel corresponding to a basic cycle:
(a)
transforming
the cumulative relativefrequency
ofarrivals within the cycle to obtain a linear statistical model
having
normal residuals withhomogeneous variance;
(b)
fitting
specially formulated polynomials to the transformedarrival
data;
(c)
performing a likelihood ratio test to determine the degree of the fittedpolynomial; and
(d)
fitting
a polynomial of the degree determined in(c)
to the original(untransformed)
arrival data.Next,
an experimental performance evaluation is conducted totest the effectiveness of the estimation method. A web-based application for modeling
NHPPs using the automated multiresolution procedure and generating realizations of the
NHPP is developed.
Finally,
a web-based simulationinfrastructure that integrates modeling,TABLE OF
CONTENTS
1. INTRODUCTION 1
2. PROBLEM
STATEMENT
53.LITERATURE REVIEW 7
3.1 Simulation Input
Modeling
73.2
Estimating
NHPPs 93.2.1 Poisson Processes 10
3.2.2Nonhomogeneous Poisson Processes 12
3.3MultiresolutionProcedure for
Modeling
NHPP 163.4 Estimation Procedures 24
3.4.1 Maximum Likelihood Estimation 24
3.4.2 Hypothesis
Testing
263.4.3 Likelihood Ratio Tests 27
3.5 Web-Based Simulation 28
4. AN AUTOMATED MULTIRESOLUTION PROCEDURE FOR MODELING
NHPPS 34
4.1 Input
Modeling
andtheNeedfor Automation 344.2 Selectionof anAppropriate Function for R.
()
354.3
Estimating
Rt()
384.4 Likelihood Ratio Test Procedure 39
4.5 An Example
Illustrating
theAutomated MultiresolutionProcedure 485. EXPERIMENTAL PERFORMANCE EVALUATION 55
5.1 GenerationofExperimental Data 55
5.2 FormulationofPerformance Measures 56
4.3 Presentation andAnalysisofResults 63
6. WEB-BASEDIMPLEMENTATIONOFMULTIRESOLUTIONPROCEDURE
FOR MODELINGNHPPS 79
6.1 Application Requirements 80
6.2 Software Selection 81
6.2.1 Static Web Pages 81
6.2.2 Dynamic Web Pages 82
6.3 Application DesignandDevelopment 84
6.3.1 Input Data 87
6.3.2 Estimate Data 89
6.3.3 Output Data 90
6.4 Web-Based Multiresolution Procedure (WBMP): An Example 92
6.4.1 Generate Data 92
7. WEB-BASED
SIMULATION
ENVIRONMENT: A CONCEPT 1027.1 General Issues 102
7.2 The Web-Based Simulation Environment Concept 104
7.3 Web-Based Simulation Environment: Design 106
7.3.1 Web-BasedInput
Modeling
1067.3.2Web-BasedOutput Analysis 109
7.3.3 Web-based Model
Building
1127.3.4Web-basedVerificationandValidation 114
8.
CONCLUSIONS
AND FUTURE WORK 117REFERENCES 120
Appendix A: VB.Net ProgramtoRun anExecutableon the Server 124
AppendixB: Input Data for Web-Based Data
Fitting
Algorithm 126Appendix C: Input Data for Data GenerationAlgorithm 127 Appendix D: Set
Up
Procedure for Web-Based InputModeling
Environment 1281.INTRODUCTION
Simulation is the process of
designing
and creating amodel ofa real or proposedsystem via a mathematical-logical model, conducting experiments on the model, and
drawing
inferences from the model about the operation of thereal system (Kelton et al.2002,
Kuhl 1997). Simulation can be appliedto describe and analyze the behavior of asystem, analyze "What if...?" questions about systems under study, compare alternative
system configurations, optimize system performance, and aid in the design of new
systems. Although other operations research tools such as queuing theory, linear
programming, etc. can also be used for these types of studies and provide closed form
analytical solutions, the simplifying assumptions necessary to use these tools may result
in models thatdo not represent the true behaviorof the system. The powerofsimulation
when comparedto otheranalysis toolsis in the abilitytoanalyze complex systems.
Thus,
in the case of complex systems
involving
variability and interdependencies betweensystem components, simulation provides the capabilities necessary for obtaining valid
solutions.
In general, simulation studies can be divided into six basic processes, namely,
problem
formulation,
creation of a conceptualmodel,creation of a computermodel, inputmodeling, verification and validation, and experimentation. Problem formulation
involves
developing
problem statement,determining
the objectives ofstudy andplanningthe study. After
formulating
the problem, thenext taskis to study the real worldprocessand create a theoretical model, which is called the conceptual model. The conceptual
model defines the aspects ofthe system that should be included in the study as well as
established, a computer simulation model is constructed. The simulation model, like all other modeling techniques, requires inputs. For simulation, the input modeling
procedures encompass all ofthe activities needed to specifytheparameters thatdrive the
simulation model
including
datacollection, analysis, andrandom variate generation. Theverification and validation process consists ofchecking the operation of the model and
determining
ifthe model accurately represents the real system.Finally,
experimentationincludes
designing
andperforming a set ofexperiments, analyzing the simulation outputsand
drawing
theinferences aboutthe system.Of the six processes mentioned above, one ofthe most important and influential
processes in any simulation study is inputmodeling. Input modelingperforms arole of
an interpreterofthereal worldsystemforthe simulation model. When available, dataare
collected from the real system, and probabilistic input models are fit to the data
by
applyingvarious statisticalalgorithms.Then,
themodels areused inthe simulation modeltocreate randomprocessesrepresentingthe system's arrivalprocesses, servicetimes, etc.
Since input modelingprovidesinputs to thesimulation model, theoverallperformanceof
the study depends
heavily
on the accuracy ofthe inputmodels. Although there is alargebody
of research in the area ofinput modeling for stationary processes, thederivation of methods for estimating and assessing the goodness of fit of models applicable tononstationaryprocessesis ongoing
(Banks,
1998).Nonstationary
arrival processes are routinely encountered in practice, and thusbeing
able to generate realizations of nonstaionary processes in simulation models isessential. The arrival of customers to a
bank,
demand forseasonal products such as lawnand the arrival oftelephone calls at call centers, are sometypical examples of situations
wherethearrival rate changes overtime. One inputmodelthathas beenusedsuccessfully
in thepast tomodel a large class nonstationaryarrival processes is the nonhomogeneous
Poisson process (NHPP). The
key
to modeling NHPP's is tocharacterize the arrival rateas the rate changes over time. Both parametric and non-parametric methods have been
developed to model the arrival rate (These methods arediscussed in detail in Chapter 3).
In many simulation studies, nonstationary arrival processes exhibit long-term trends or
multiple periodic behaviors. For example, in analyzing the arrival streams of
liver-transplant donors and patients for the UNOS Liver Allocation
Model,
Pritsker et al.(1995)
observed some arrivalratesto reveal significant growth overtime as well asdaily,
weekly and annual effects. To model such type of NHPPs with
long
term trend andperiodic effects, Kuhl and Wilson
(2000)
derive a flexible theoretical procedure calledthemultiresolution procedure.
Kuhl and Wilson
(2001)
present thetheory
behind the multiresolution procedureandshow examples of how theirmethod works.
However,
these examples require somesimplifying assumptions and require the user to determine the adequacy of the fitted
model. Inthis thesis, statistical procedures are developedthatcanbe usedto automate the
model
fitting
process for the multiresolution procedure without the need for simplifyingassumptions or intervention
by
the user. The statisticalprocedures are implemented in acomputer program, and an experimental performance evaluation of the procedure is
conducted.
The dominance oftheInternet in the development ofinformation and
technology
to disseminate the automated NHPP
fitting
procedure, a web-based input modelinginterface is developed. Basedon a thorough literature review, this is the first web-based
input modeling software of its kind.
Thus,
some unique challenges were encountered inthe developmentof web-basedsystem. Included in theweb-based input modelingsystem
is the ability forthe userto submit a file containing collected data (arrival times) and to
specify parameters of the
fitting
procedure. Graphs of the fitted NHPP and each of thecomponentsofthemutliresolutionprocedure are plotted versustheobserveddata. Output
files ofthe parametersare alsogenerated and can be downloaded
by
theuser. In addition,the web-based system canbeused togenerate realizationofthefitted NHPP thatcouldbe
usedin simulation experiments.
Research intothe area ofweb-based input modeling, has led to the investigation
of a web-based simulation infrastructure that would allow users to conduct complete
simulation studies on the web. The available functions would include the ability to
(a)
upload a simulationmodel, or at some pointin the
future,
tobuild a simulation modelonthe web;
(b)
perform verificationandvalidation overthe web;(c)
conductinput modelingand analysis; and
(d)
conductoutput analysis. The web-basedinput modeling module isdemonstrated through the implementation of the NHPP
fitting
procedure. Although fulldevelopment of each of the fourmain components is beyond the scope ofthis thesis, the
infrastructure for each of the modules will be discussed and demonstrated with simple
2.PROBLEM STATEMENT
The focus of this research is on modeling and simulation of complex arrival
process. In particular, the research examines the multiresolution procedurefor estimating
the mean value function of nonhomogeneous Poisson processes
having
long
term trendsor nested cyclic behavior over timewith the goal of
being
ableto generate realizations oftheNHPPs in simulation experiments. This work extendsthe capabilityand
functionality
ofthe theoretical basis forthe multiresolution procedure established
by
Kuhl andWilson(2001)
by
providing an automated statistical analysis methodology for modeling NHPPthat is implementedin aweb-basedinput modelingand generation software.Theresearch
workhas fourmainobjectives,and each objective has itsown uniquechallenges.
Thefirst objectiveofthis thesisis to automatethe multiresolution procedure. The
multiresolution procedure estimates mean value function of a NHPP. The mean value
function is calculatedfromthefunctions fittedto
long
term trends and all periodiceffects(i.e. to all resolutions).
Therefore,
to automate the multiresolution procedure, the firsttask is to determine a
function,
which should not only fulfill multiresolution procedurerequirements but also demonstrate ability to model complex trends and cyclic effects in
data.
Next,
a procedure has to be determined to estimate the functions at all theresolutions. After this research, a computerprogram needs to be developed and tested.
Finally,
a complex example has to be selected to validate the results of the automatedmultiresolution procedure.
The second objective of this research involves conducting rigorous experiments
circumstances. This experimental performance evaluation includes design of a set of
experiments,determinationof performance measures and
finally
analysisoftheresults.The third objective concentrates on design and development of web-based
multiresolution procedure. This applied research encompasses selection of appropriate
software and
hardware,
designanddevelopment ofthe web basedinfrastructure for input modeling, andfinally
implementation andtesting
of the infrastructure with themultiresolutionprocedure algorithm.
The final objective consists of
developing
a conceptual web-based simulationenvironment.
According
to Kuljis andPaul(2000),
web-based simulation researchers areexploring all new directions but still not able to achieve the full advantages of the
internet. This thesis work studies the latest web-technologies and their capabilities,
analyzes approaches taken in web-based simulation
literature,
andfinally
designs aconceptual web-based simulation environment, which can utilize the benefits of the
Internet benefits to conduct simulation studies. In addition, the simulation environment
3.LITERATURE REVIEW
The focus of this research is to automate the multiresolution procedure for
modelingnonhomogeneousPoissonprocesses and toillustrate web-basedinput modeling
by developing
a web-basedimplementation ofthe multiresolution procedure.Toestablisha basis for the automation of the multiresolution procedure, simulation input modeling
methods
induing
nonhomogeneousPoissonprocesses andtheir characteristics, parameterestimationprocedures, andlikelihoodratio testsarediscussed. A detailed summaryofthe
multiresolution procedure and its characteristics is also provided.
Finally,
the currentresearchinthefieldof web-basedsimulationisreviewed.
3.1 Simulation Input
Modeling
When asimulation model is created, there aremany details thatmustbe specified
in order to define a valid simulation model. These details include simulation model
inputs. Theproceduresto determinetheseinputsarereferred asinput modelingmethods.
Figure 3.1 illustrates the input modeling process
(Banks,
1998). In general, thisinput modeling process is the method
by
which the process andevents that occur in thereal system are translated into mathematical models that will be used in the simulation
program.
Thus,
each simulation model inputcorresponds to aprocess oreventin thereal-world system. The input modeling process requires combination of
data,
relevantapplicable
theory
(theory
of stochastic processes or related to the system underconsideration) and prior experience. The data required for input modeling are collected
from the real world.
Sampling
methods are used if observation data are available. Theapplication of
theory
on data is completely influencedby
the characteristics of theexperience of the practitioner or expert opinion can be valuable in ascertaining proper
understanding of current process, appropriate application of
theory
and valid results.Various input modeling strategies are developedandused tofitthe input model
(Nelson,
1995). To beuseful, theinputmodel thatis developedmustbe bothreasonableand valid.
That
is,
themodel must be statisticallysound and at the same time be able to accuratelyrepresent the system under study.
Thus,
an assessment of the reasonableness of theresulting model andavalidityassessmentoftheresultingmodelmustbe made. Thefitted
inputmodel is then passedto thesimulation program wherethe random variate generator
employs theinputmodel tocreate realizations oftheprocess.
Validity
Assessment
?
Real World Process
Reasonableness
Assessment
Sampling
Data
Theory
Past Experience
Modeling Strategy
[image:16.520.46.476.321.573.2]Input Model
In practice, there are many different types of processes and data resulting in
various modeling strategies.
Here,
the thesis focuses on processes from whichobservational datacan beobtained to which stochastic process models and
theory
can beapplied.
3.2
Estimating
NHPPsAny
process that has a random element is considered a stochastic process.Stochastic processmodels are waysofstatistically analyzingthe dynamicrelationshipsof
sequences of random events (Cox and
Isham, 1980,
Taylor andKarlin,
1994). Randomprocesses occur very often in almost every field of natural and engineering sciences.
Stochastic modeling can be applied to analyze the variability inherent in manufacturing
systems, biological and medical processes, to deal with uncertainties affecting decision
making, and with the complexities ofpsychological and social interactions. In addition,
stochastic modeling can provide viewpoints, methods, models and insightto aid in other
mathematical and statistical studies.
Stochastic processes can be classified as
being
stationary or nonstationary. Aprocess is said to be stationary if the process mean and the process variability remain
constantovertime. A nonstationaryprocess,
however,
changes overtime.In the case of stationary stochastic processes, a random sample of independent
and
identically
distributedobservationscanbe collectedand aprobability distributioncanbe fit to the data. A
three-step
procedure usedto fit theoretical distribution to the data isasfollows:
1. Hypothesize a candidate distribution:
first,
ascertain whether the current process is2. Estimate the parameters of the hypothesized distribution: For example, if the
hypothesis is that the underlying data are exponential, the parameter to be estimated
fromthe data istheexponential rate.
3. Perform goodness-of fit test such as the chi-squared test: If the test rejects the
hypothesis that is a strong indication that the hypothesis is not true. In that case, go
back tostep
1,
orapplytheempirical distribution ofthedata.In the case of nonstationary stochastic process, complete observations of the
processes over the time interval of interest are needed to fit a nonstationary stochastic
model. One model, which has successfully been used to represent a large class of
processes, is the nonhomogeneous Poisson process (NHPP). The NHPP is a
generalization of the stationary process known as the
(homogeneous)
Poisson process.Therefore,
to aidindiscussion,
the properties ofthe(homogeneous)
Poisson process willbepresentedin thenext section priorto
discussing
NHPPs.3.2.1 Poisson Processes
A Poisson process is a particular type of stochastic process referred to as a
countingprocess, pointprocess, orbirthprocess.
Namely,
aPoissonprocessNt
measuresthe number of randomly occurring events that have occurred in the time interval
(0,0-The Poisson process is characterized
by
the average rate of occurrence of events, X. Inaddition, aPoissonprocess hasthe
following
properties:1. Thenumberof events atthe start oftheprocess (time
0)
iszero.2. The process has independent increments. That
is,
the number of customers arrivingduring
one time interval does not affect the number arrivingduring
a different time(non-3. The average number of events that occurin an interval oftime is proportional tothe
lengthoftheinterval.
4. Customersarrive one at atime.
Formally,
theseproperties havethefollowing
mathematicalformulation1.
Nt
= 0.2. For anytimet, s
>0,
thedistributionofNt+S
-Nt
is independentoft.3. Theaverage rate meansE[Nt]=
At
(A isthe average rate).4. Foralmost allrealizations, forsmall s, andA.as therate
P{N(t,t+s)=
0}
=l-As+ o(s);P{N(t,t+ s)=
1}
=As+o(s); and
P{N(t,t+s)>l} = o(s)
for any t,
Nt+S
-Nt
is independentof{
Nu
; u <t}.A stochastic process
Nt
satisfying the above assumptions is called a Poissonprocess with rate parameter A
(
Cinlar, 1975, Bhat,
1972).Furthermore,
the number ofevents thatoccurin aninterval oftime has aPoisson distribution with aprobabilitymass
functionofthe
form,
. .
(AtVe-M
p(x) =
; , X
- 0,1,2,. ..
In addition, the distribution of the time between events has an exponential distribution
with a mean of\IAandhasthe
following
density
function,
Ae'M,t>0.
The Poisson process is commonly used in simulation to model arrival process, such as
arrival of parts to a machine in a manufacturing system, the arrival of customers to an
ATM,
andthenumberofdefects in aboltofmaterial.3.2.2 Nonhomogeneous Poisson Processes
In many practical situations, stochastic processes such as arrival processes are
nonstationary. That
is,
the arrival rate changes over time. Examples of this may includearrivalstoabankoverthecoarseofa
day,
demand for lawn mower overthe year, andthedemand foraproduct overits life cycle. Amodel
frequently
used tomodel these types ofnonstationaryprocesses is thenonhomogeneousPoissonprocess.
AnNHPP
{N(t),t>Q}
is a countingprocess whereN(t)
is thenumber of arrivalsinthe time interval
(0,
t] wheretheinstantaneous arrival rate attime t,X(t),
is anonnegativeintegrable functionoftimeandthecorresponding
(cumulative)
mean valuefunction isThe rate or mean value function of the NHPP completely characterizes the
probabilistic behavior ofthe process
(Cinlar,
1972,
Cox andIsham,
1980).Furthermore,
the
following
propertiesholdforan NHPP:N(0)
= 0 ;P{N(t,
t +s)=0}
=1-A(t)s
+o(s) ;P{N(t,t+s)=
l}
=A(t)s+o(s); and
P{N(t,t+s)>l}=o(s).
Therefore,
by
accurately modelingthe rateormean valuefunction,
the NHPPcanbe completely defined and used to generate realizations of the process in the simulation
model. Both parametric andnonparametric methods have been developedto estimatethe
rate or mean value function from observed data. Leemis
(1991)
developed a simplenonparametric method to specify a observed piecewise-linear estimate where the
breakpoints are determined
by
the observed arrival times in a superposition of severalreplicated observations ofthe process. Lewis and Shedler
(1976b)
derive techniquesforestimating the rate function to model the transactions in a database system. One
nonparametric estimatorthat
they
derive isX(t:n,t0)
=wheret0 istheofthe time
interval,
nis thenumber of observations in(0,
t0],T0,
T\,
...,Tnarethe observations,
b(n)
is abandwidthfunction that tendsto zero as n > andW is abounded,
nonnegative,integrable weightfunction satisfyingf
W(u)du = J-aIn parametric models, several authors have proposed procedures where the
estimatedrate function is translated as piecewise-linear or piecewise-polynomial forms.
Law and Kelton
(2000)
discuss the piecewise constant method of modeling an NHPP.This procedure divides the time axis into nonoverlapping time interval where the rate is
considered to be constant and estimates a single rate foreach interval. Kao and
Chang
(1988)
simulate the times ofcalls foranalysis of electrocardiograms over several days ata hospital using piecewise-polynomial
intensity
function. Lewis and Shedler(1976a,
1979)
model NHPP withlog
linearrate function anddegree two exponential polynomialratefunction.
Asecond approachis to assumethattherate function hassome specific functional
formthathas a sufficient number of parameters to allow itto fitobserveddatawell.
Lee,
Wilson and Crawford
(1991)
illustrate a generalized model, which uses anexponential-polynomial-trigonometricfunction
(EPTF)
in theratefunctionasA(t)
= exp^r,r'
+ysin(ax+
0)
, which
they
apply tosimulate offshore weather events in the Arctic Seainvolving
both acyclic component and a trend.
Johnson,
Lee and Wilson(1994)
evaluate the EPTF-typerate function and conclude that the EPTF can adequately model an NHPP with a rate
function
having
a slowly varying long-term trend or a cyclic rate component.Further,
they
demonstrate that the parameters of EPTF-type rate function can be effectivelyestimated
by
using likelihood ratio test to determine the degree of polynomial ratecomponent and
by
solving the likelihood equations to obtain the maximum-likelihoodestimates ofthe continuous parameters ofthe ratefunction.
Kuhl,
Wilson and Johnson(1997)
further generalize this model to include multiple cyclic effects.They
propose aparametricmodel, anEPTMP-typeratefunctionoftheform
A(t)
-exp
J^af
+rtsin(fly
+0t)
t=i
to model Poisson process
having
trends or multiple periodicities. This model wasimplemented in a large-scale simulation model
(ULAM)
of the organ procurement andtransplantation network
(Pritsker,
1998). Kuhl and Wilson(2000)
formulated andevaluatedan ordinary least squares
(OLS)
procedurefor estimating the parametricmean-value function of a NHPP and demonstrated that the estimation accuracy and
computational efficiency ofthe OLS procedure in this type of application is reasonable.
During
the evaluationprocess,they
use an approximate likelihoodratio test to determinedegree of polynomial. Kuhl and Wilson
(2001)
derive a nonparametric technique titledmultiresolution procedure for estimating the
(cumulative)
mean-value function of amay not have familiar trigonometric characteristics such as symmetry over the
correspondingcycles.
The next section focuses on a nonparametric method developed
by
Kuhl andWilson
(2001)
referred to as a multiresolution method for estimating the mean valuefunction of an NHPP from observed data where the process may exhibit behavior
including
along
term trend over time as well as multiple(nested)
cyclic effects. Sincethis proposal entails automating thisprocedure (Chapter
4)
andimplementing
it in aweb-basedenvironment(Chapter
6),
thismodelis discussed in detail.3.3 MultiresolutionProcedurefor
Modeling
NHPPThe multiresolution procedure (Kuhl and
Wilson,
2001)
is used to estimatethemean valuefunction ofanNHPP
having long
term trendormultiple periodic componentsorboth. The mean value function is fit dataobserved overthe time interval
(0, 5],
wherethe data may contain a
long
term trend or p cyclic effectshaving
periods oflength,
bi(i=\,2,...,p),
respectively. The observed arrival process must statisfy thefollowing
twoassumptions:
Assumption 1. There arep distinct cyclelengths
(periods) >,
>b2
> >bp
suchthatbt
isan integral multiple of
bi+]
for i-l,2,...,p-l.Moreover,
the time horizon(0,S]
is takensuchthatS isandintegralmultiple ofbj;andlet
bo
- S.Assumption 2. Within thej"1 cycle
[(j
- 1)&,. ,;',.)
of length
bi (
i =1,2,
..,p) , the arrival
rate at time tisproportional toa singlebaseline function ofAjs), where s = t
-(j
-1)hi
is the offsetfrom the
beginning
ofthe cycle so thatMf)=aijAi(t-(j-\)bi)=aUjAi(s)i
and{
alt :I -1,2,...,
SToestimate themean value functionpif), thefunction
/?,
(s)
at each resolution forsg
[0,6,)
and i=0,1,2,...,
p needs to be estimated first. Note that at each resolution,
,(0)
=Rt(0)
= 0 andR.(bd
=/?,(,)
= 1 for i=0,1,2,...,
p . At resolution
0,
amonotonically
increasing
function R0(t),t[0,5],
is fitted to the points,{
[jb^NUb^NiS)]1:j =
0X...,S/bl
}
such that/?0(0)
=and
R0(S)
=l.Forresolution i for i =
1,2,...,
p
-1, amonotonically
increasing
function/?,
(5)
isconstructed to estimate the cumulative proportion of arrivals that are expected to occur
during
the first 5 time units of the cycle, where se[0,>()
. Thatis,
for resolutionI,
Ri
(s)
is fittedtothepoints <[jbM
,G, (jbM
)]T:
j
=0,1,...,&,
/bi+l
>, where(.sb.yi
1 ^o,)-i
1
N(S) tt
forall se[0,bt).
At resolution p, let
rk
: k =1,2,... A/(5)
denote the observed arrival times anddefine
rjk
=rkmodbp
for k-1,2,...,
N(S)
. Letrj(k)
: fc =1,2,...,
N(S)
denote theordered arrival times on theinterval [0,fe
)
. A monotonicallyincreasing
function is fittedtothepoints
j
[77(t),*/iV(5)]T:fc =l,2,...,Ar'(5)|.fi(t)
=N(S)QQ(t)
for re[0,5]
wherethefunctions {Q.
(t)
:i = p,p-1,...,1,0}
aredefinediteratively by
Qp(t)
=Rp(t-(jpa-l)bi),
andfor i =
p-1,...,1,0
Qiit)
=Rl((jl+u-V)bM-(jit-\)bl
+
[A(7,+iA>
-ih,-l)b,-R,(ji+u -l)bt+1 -(;,, -V)bt]QM(t)
where ,is aunique
integer)
suchthat(j
-\)b{
<t <jbt
.To illustrate the multiresolution procedure, an example of arrival process whose
instantaneous arrival rate over a time horizon of 28 days contains two cyclic effects
including
weekly anddaily
cyclic effects thatis,
periodic rate componentswith periodsof 1 week and one
day
is considered. The NHPP is constructed so as to satisfy all theassumptionsdiscussed inthepreviouspart, where p(5)= 216 and
Resolution0:
R0(s)= 0.0357 143s se
(0,28]
Resolution 1:
R^s)
= 0.2582035 -0.0164780s2
Resolution 2:
R2(s)=
1.832005-0.890900
s2+ 1.02624s4
se
(0,1]
Dataforrealization of arrival process is generated usingthe simulation algorithm
described
by
Kuhl and Wilson (2000). Application of the multiresolution procedurestarted
by
estimatingthefunctionRt(s)
correspondingtoeach resolution ifor i-0,l,...,p.Since the NHPP defined above consists of two periodic components, the procedure
estimated the function
R{(s)
at:(a)
resolution0,
corresponding to the long-term trendorwith a period of 28
days; (b)
resolution1,
corresponding to the cyclic rate componentwith a period of
b\
=7days;
and(c)
resolution2,
corresponding to the cyclic ratecomponent with a period of
b2
= 1 day. While estimatingRt(s),
the degrees ofpolynomials are specified
by
theuserand assumedto be known.For estimating mean-value
function,
the points of importance at resolution 0 arethose that correspond to the cumulative fraction of arrivals observed at the end of each
week. At resolution
0,
1- degree polynomial was fitted to thefraction of all arrivals thatwere accumulated at the end of each week. The estimated function at resolution
0,
isshownin Figure 3.1. Thepoints ofimportanceat resolution 1 arethosethatcorrespondto
the cumulative fraction ofarrivals observed at the end of each
day
(thatis,
at the end ofeach resolution-2 cycle). In this example, a quadratic function was fitted to the
cumulative fraction of arrivals observed at the end of each
day
of the week. Theestimated function at resolution1 is shown in Figure 3.2. The next step in the procedure
example, all the arrivals areimportant points. A fourth degree polynomial is fitted to the
observed arrivaltimes. Theestimatedfunctionat resolution 2 is shownin Figure 3.3.
Thefittedpolynomials at each resolution are:
Resolution 0:
Ro(s)= 0.0357143s se
(0,28]
Resolution 1:
Ri(s)= 0.260803665s-0.0168495 s2
se
(0,7]
Resolution 2:
R2(s)= 1.872994185-1.32037175s2-0.183941s3+ 0.631318599 s4
se
(0,1]
When all the functions are estimated, next task was to estimate the mean value
function. Theprocess ofestimatingthe overall mean valuefunction started
by
estimatingthemean value functionatresolution
0,
po(t),forte[0,28]
. Tocompletethis,A>(f)=
N(S)
G0,o(0
=N(S) R0(t)
forallte[0,28]
wascalculated.
Afterthecalculationof resolution0component, nexttask wastoincludethedetails
of the weekly periodic component to the estimated mean value function. This was done
Applying
the equations, thefollowing
estimate of mean value functionwas acquired for
thefirstweek:
Mt)
=N(S)
<2o,i(0=
N(S) /?o(7) Q,,i(0
=
N(S) R0(7) Ri(t)
for all re [0,28],The final step was to include the details of the
daily
periodic component to theestimated mean value function calculating/i2(t), for te [0,28].
Applying
the mean valuefunction equations, the
following
estimate of mean value function was acquired for thefirst day:
/u2(t) =
N(S)
Q0,2(r)
=N(S) R0(7) Qh2(t)
=
N(S) RoO) Ri(l) Q2,2(t)
=
N(S)RQ(l)Rx(\)R2(t)
forall re [0,28]. Themean valuefunctionwas calculated at resolution 1 forsubsequent
weeks and at resolution 2 for subsequent days. Since the process consists of only two
periodicities, the final estimate of the mean value function was p,2(i). The fitted mean
valuefunction plotted against the observed cumulativenumber of arrivals overthe entire
14
Time(Days)
21 28
Figure 3.1 Estimated Linear Function atResolution 0
CD
>
<
"o
O
CO
CD
E
O
0.5
3 4
Time(Days)
0.2 0.4 0.6
Time(Days)
0.8
Figure 3.3 Estimated Quartic Function atResolution 2
Kuhl and Wilson
(2000)
appliedthe method ofinversion to generate realizationsofthe fitted function while simulating NHPPs
having
a multiresolution type mean valuefunction.
Clearly
the efficiency of this simulation scheme depends on the numericalmethods applied to invert estimates of the functions. A nonparametric estimate of the
mean value function of an NHPP
having
long-term trend or cyclic effects that mayexhibit nontrignometric characteristics canbe obtained
by
the multiresolutionprocedure.Themultiresolution procedure allows us to model asymmetric periodic behaviorwith no
14
Time(Days)
21 28
Figure 3.4EstimatedMean ValueFunction (smooth curve) Superimposedonthe
Observed Cumulative Arrival Function
(Step function)
3.4 Estimation Procedures
In order to automate the multiresolution procedure described in the last section,
this research designs a method, which includes a likelihood ratio test (Chapter 4).
Therefore,
we will review, in this section, maximum likelihood estimation, hypothesistesting, andlikelihoodratiotests.
3.4.1 MaximumLikelihood Estimation
One of the most commonly used estimation method is maximum likelihood
estimation. Maximum likelihood estimation has been described as the originaljackknife
becauseofits applicabilitytoawide varietyof problems
(Breiman,
1973). Themaximumspace i^thatwillmaximizethevalue ofthelikelihoodfunction. SupposethatX=
(X\,
...,
Xn
)
are random variables with jointdensity
orfrequency
functionf(x:0)
where 0e y/.Given theoutcomesX=
x,thelikelihoodfunction is
L(0;x)=f(x;0)
(Casella and
Berger, 1990, Ross,
1989). The distinction between these two functions isthe variable that is considered fixed and the one that is varied. When the pdf
/
(x;0)
isconsidered, 0 is considered as fixedandxas the variable; when thelikelihood function
L(0;x)
is considered, xis consideredasfixedand 0 asthevariable.If the data range does not depend on the
data,
the parameter space \ffis an openset, and the likelihood function is differentiable with respect to 0-(0\
0P)
over y/,then themaximumlikelihoodestimate
(0)
satisfiestheequationsd\n(L(0))
=0 for k =
1,2,....,
p .
d0k
These equations are called likelihood equations and
\n(L(0))
is called log-likelihoodfunction
(Knight,
2000).There are twoinherent drawbacks associated with thegeneral problem of
finding
the maximum of a
function,
and hence of maximum likelihood estimation. The firstproblem is that of actually
finding
the global maximum and verifying that,indeed,
aarise. The second problemis that of numerical sensitivity. It is sometimes the case that a
slightlydifferent sample will produce a
totally
differentMLE,
making itsuse suspect.In some cases, function
R(0)
of parameters 0 needs to be estimated. In suchsituation, one of the most important properties of maximum likelihood estimators what
has come to be known as the invariance property of MLE is applied. The invariance
property states that if 0 is theMLE of
0,
then for any functionR(0),
theMLE ofR(0)
is7?((9)(CasellaandBerger,
1990).Maximum Likelihood estimation provides the theoretical basis on which
statistical tests such as hypothesis tests (and in particular likelihood ratio tests) can be
developed.
Therefore,
the idea ofhypothesistesting
is discussed first in the next sectionandthen thelikelihoodratiotestis elaborated.
3.4.2 Hypothesis
Testing
A statistical hypothesis is a statement about the parameters of one or more
populations andthe
decision-making
procedure about the hypothesis is called hypothesistesting. Statistical hypothesis
testing
can be divided into six steps. The first step in anytype ofhypothesis
testing
is toidentify
the parameterofinterest. In the currentproblemdomain the diameter of rod is the parameter ofinterest. The second step is to state the
null hypothesis andto specify the alternative hypothesis. In the third step, a significance
level is selected. As the significance level
increases,
the precision in the decisiondecreases.
Here,
the significance level is the probabilityof obtaining accurate estimation.In the fourth step, an appropriate test statistic is determined. In the next step, the test
hypothesis
testing
comprises whether to accept or reject the null hypothesis and torepresentthatintheproblem context.
A widely used method of test construction is the likelihood ratio principle.
Statistical methods obtained
by
using likelihood ratio principle often turn out to havesmallest type II error among all tests thathave the same typeI errorprobability
(Nelson,
1995).Thesetests are called aslikelihoodratiotests(LRT).
3.4.3 Likelihood RatioTests
The likelihood ratio method of hypothesis
testing
is related to the maximumlikelihoodestimators, andthe likelihoodratio tests are as widelyapplicable as maximum
likelihood estimation. Recall that if
X\,
...,Xnis arandomsample from a population with
probability distribution
function/(jc;(9),
thelikelihood function is definedasL(0;xx,...,xn)
=L(0;x)
=f{x\0)
=f[f(xl;0).
i=i
The likelihoodratio statistic%is definedas
Sup
L(0;
x)*(*)= ""
SupL(0;x)
vwhere
L(6;x)
is a likelihood function. A likelihood ratio test isHQ:0Ey/x
versusH.:0Ey/[
where y/=critical ratio value c andacceptingif %(x)> c (Casella and
Berger,
1990, Brieman,
1973).Even if the two suprema of
L(0,x),
over the sets y/\ and y/, can not be analyticallyobtained,
they
can usually becomputed numerically.Therationale behind LRT may best beunderstoodinthe situation in which
f(x;0)
isthe pdf of adiscrete random variable. In this case the numerator of%(x) is the maximum
probabilityofthe observedsample, themaximum
being
computed overparametersinthenull hypothesis. The denominator of %(x) is the maximum probability of the observed
sample over all possible parameters. The ratio ofthese twomaxima is small ifthere are
parameter points in the alternative hypothesis for which the observed sample is much
more
likely
thanfor anyparameter pointin thenull hypothesis. In this situation, theLRTcriterion says
Hq
shouldberejectedandH\
accepted astrue(Knight,
2000).In this thesis work, alikelihood ratio test is designedto determine the degree of a
polynomial. Thistestautomatesthemultiresolution procedure.
3.5Web-Based Simulation
After reviewing web based simulation research papers from
1996-2001,
it isobservedthat the speedin which the web based simulation is progressing, andthe desire
to create complex simulation models and perform simulation operations on web will not
be possible inimmediate future.
However,
According
to Kuljis and Paul(2000),
there iscertainly promise for certain type of applications to utilize the advantages of Internet
computing andthe web. Themost
likely
candidates are applications that deal with hugequantities of data such as meteorological modeling; applications that enable users on
customer (forexample,
manufacturing
which offers customizedproducts) or applicationsineducation and
training
whichincreasingly
have tocatertodistancelearning
students.As far as web based simulation is concerned, there are currently two main
approaches that are both based on the Java language. One approach develops a new
simulation language using
Java,
and the other approach ports an existing simulationlanguage,
such asGPSS,
and creates it in Java (Whitman 1998). Advances in Javadevelopmenttools andJavaBeans
technology
thatcanfacilitate developmentsofreusablecomponents in combination with the ability to distribute and execute models via the
Internet may foster increased activity in the development of
high-level,
domain-specificsimulation toolsaimed attheend users.
The integration ofthe web and Javarepresents a technological advancement that
enables a
fundamentally
new approachto simulation modeling.Healy
(1998)
attributestothe role, the Java language plays in both the specification and implementation of the
model. Paul
(2000)
explains how Java is beneficial for web-based simulation. Kilgore(1998)
lists Java features thathave the potential todramatically
change the processusedtobuildcomputer simulation models.
Kuljis and Paul
(2000)
discuss several Java based discrete simulationenvironments and conclude that developments are surprisingly conservativeand lack the
veryentrepreneurial nature of media
they
aretrying
to conquer.Most ofthe workfocuseson the tools that are
being
used(Java,
Javabeans etc.) rather than how the new mediumcan enhanceour use of simulation as amodeling vehicle. As it
is,
most applications are"invented"
and answer the question "
What can I put on the
web"
simulation has to be available on the web, how can I design to facilitate the needs ofits
users."
The main architectures of web-based simulations have been shown and tested
successfully
during
recent years (Fishwick1996,
Healy
and Kilgore 1997). Wiedemann(2001)
compares common web technologies with simulation characteristics as shown inTable 3.1. The comparison reveals large differences concerning all criteria. A simple
transformation of actual simulation techniques into the web will result in the same
disadvantages plus some new restrictions and problems caused
by
the web itself. Themain cause forthe differences consiststhe variability in the differenttypes of operations.
A typical Internet surfer essentially works in a passive, assimilating mode. His creative
workis limitedtodecisions aboutselecting thenext
interesting
linkor newcombinationsof search criteria. A wrong decision can be promptly corrected
by
pressing the "Back"button.
Modeling,
simulation and result analysis require a high level of active andcreative work. A problem of web based simulation systems is the high complexity of
theiruserinterfaces.
Fishwick
(1996)
has divided the web based simulation in three topics:1)
Education and
Training
2)
Publications3)
Simulation Programs anddiscusses proceduresfor
implementing
web technologies and concepts in simulation modeling. Yu-hui Taoand
Shin-Ming
Guo(2001)
designedawebbasedtraining
systemforsimulationanalysis.According
toPidd, Oses,
Brooks(1999),
various forms of distributed simulation arepossible over the world wide web,
including
simple multiple replications of the samemodel, clientserver architecturesforoneormore simultaneouslyrunningmodels andthe
component-based approach for web-based simulation. Alfonseca
(2000, 2001)
has builtandexecuteddistributedweb basedmodels.
Table 3. 1. ComparisonofWeb andSimulation
Technology
CharacteristicsCommon webtechnologies Traditionalsimulation
technologies Common
Standards
Yes
(HTML, XML, TCP/IP)
Nocommonstandardformodel andresults
Data
handling
By
uniqueURLProprietary
Information
Structures
Treestructures andlists
Very
complexSpecialist
Knowledge
No
(only
clickingandreading)
Yes (fromalotof scientific areas)
Navigation
Easy
DifficultEase ofUse
Very
goodVery
difficultTypeof Operations
Readand analyze
information
Synthesizemodels and analyze
results.
Seila and Miller
(1999)
present a design for scenario managementin web basedsimulation. The objective of the design is to perform output analysis on the web. The
JSIM environment provides the requirements for scenario management and output
analysis. The web based interface created
by
Guru, Savory,
Williams(2000)
connectedSEVIAN server, Oracle server andApplication serverto allow users to store and execute
SEVIAN simulation models overtheInternet.
Bodner,
Govindraj (2000)
connectedArena,
CORBA and JAVA applets to design a web-based simulation environment to support
modelinganddesignofmanufacturingand material
handling
systems.Current Java basedsimulation languages are Simjava (Howell and McNab
2000),
DEVSJAVA (Sarjoughian and Zeigler
1998),
JSIM (Miller1998), Javasim,
JavaGPSSenvironment) combines web
technology
with the use ofJava and CORBA (Iazeolla andD'Ambrogio 1998). Most of the
languages
are Java versions of existing simulationlanguages.
The concept of open source
modeling
(Wiedemann, 2001)
represents anopportunity to improve the quality of common core simulation
functions,
improve thepotential for creating reusable modeling components
using
XML,
HLA and othersimulation community standards. It aims at
leveraging
the unique communication anddistribution opportunities created
by
the Internet to open the development of simulationsoftwareto aworldwidecommunityoftalentedsoftwaredevelopers andresearchers. This
conceptis atextremely earlystage ofdevelopment.
In contrast, current object oriented commercial simulation languages are capable
of modeling any type of complex systems. Benson
(1997)
illustrated ProModel as apowerful yeteasy to use simulation toolfor modeling all types ofmanufacturing systems
ranging from small job shops and machining cells to large mass production, flexible
manufacturing systems, and supply chain systems. Most of the languages are windows
basedsystems withintuitive graphical interfaces and object-orientedmodelingconstructs
that eliminate the need for programming. Swets and Drake
(2001)
have introduced thenew Arena product
family,
which allows the modeler to creatediscrete,
continuous aswell as combinedmodels. Arenacombinesthe ease of usefound in high-level simulators
with the
flexibility
of simulationlanguages,
and even all the way down to generalpurpose procedural languages such as the Microsoft VB programming system orC ifthe
user needs. The hierarchical structure of Arena allows user to shift from lower level
discussion,
it can be concluded that current commercial modeling languages are moreuser
friendly
as compared to web-based simulation environments.They
allow themodeler to construct complexities in the model and the level of
flexibility
is high foranimation, output analysis and other simulationstudyrequirements.
InChapter 5 andChapter
6,
a web-based simulation infrastructure is demonstratedutilizing current web technologies and current web-based simulation languages. This
4. AN AUTOMATED
MULTIRESOLUTION
PROCEDURE FORMODELING NHPPSThe objective of this chapter is to
develop
a method for automating themultiresolution procedure for modeling NHPPs. The multiresolution procedure, as
discussed in Chapter
3,
requires several components tofully
automate the procedureincluding
the selection and estimation of a function to be fit at each resolution. Insummary to automate the procedure, a special form of a polynomial is selected to fit at
each resolution. The polynomial at each resolution is estimated using ordinary least
square procedure (Kuhl et al. 2000). A likelihoodratio test is designed to determine the
degree of each polynomial. The likelihood ratio test requires the
data,
which havenormalized residuals.
However,
Kuhl and Wilson(1997)
show that the data observedfrom anonhomogeneousPoisson processdo notpossessnormalized residuals.
Therefore,
to apply likelihood ratio test, the original observed data from the nonhomogeneous
process is transformed using an arcsin square root transformation to obtain data with
normalized residuals. This transformed data is utilized only to determine the degrees of
polynomials. Once the degrees are
determined,
these degrees are used to estimatepolynomials from original data. These polynomials are utilized to calculate the mean
value function to representthe NHPP. The derivation along with a demonstration ofthe
automatedmultiresolutionprocedure with thelikelihoodratiotest is discussed below.
4.1 Input
Modeling
and theNeed for AutomationIn thewords ofFriedrich Engels (1820-1895), "An ounce of actionis worth aton
of
theory."
This statement can be applied effectively to many fields of research. In the
input modeling,
however,
without a method developed forimplementation;
the modelmay not be used in actual practice. Due to the high cost and large amount of time
required for
implementation,
the use of a model may become economically prohibitive.Kuhl and Wilson
(2001)
establish the theoretical basis and general methodology formodeling NHPP's usingthemultiresolution procedure.
However,
fora simulation analystto apply the multiresolution procedure in practice, the analyst must complete the
following
tasks:1. Obtain an appropriatefunction thatfollowsthe theoretical requirements;
2. Obtainan appropriate methodfor
fitting
thefunctionto dataateachresolution;3.
Verify
that themodel meetsthe theoretical assumption ofthe procedureas well astheassumption ofan
NHPP;
4. Validatethe model;
5. Writecomputer code forthe
fitting
procedure; and6.
Apply
themodel to thesystem under study.Therefore,
this research workdevelops,
tests and implements an automated procedurethat will satisfy the first 5 criteria,
leaving
the useronly with applying the model to thesystemunder study.
4.2Selectionof an Appropriate Function for
Rt
()
Recall that in the multiresolution procedure, at each resolution level
i,
anestimator
Ri
(s)
ofthefunctionRt
(s)
mustbeobtained withthefollowing
properties: theinitial value /?,.
(0)
=Ri
(0)
=0;
the final valueRt
(bi)
=Ri
(bt )
=1;
and the derivativeR'(s)
>0 for se[0,
bt]
and i =0,
1, 2,
At resolution level
0,
a monotonicallyincreasing
functionR0(t),
re[0,5]
isfitted,
to the points{
[jbx,
N\jbx)
I N'(S)]r:
j
=0, 1,
...,
Slbx
}.
At resolution level ifor / =
1, 2,
..., p
-1, a monotonically
increasing
functionRi
(s)
is constructed toestimate thecumulative percentage of arrivals thatare expectedto occur
during
thefirststimeunits ofthe associated cycle oflength
bn
where s e[0,
&,.]
. Thatis,
forresolutionlevel
i,
the functionRt
(s)
is fittedto the points{
[
jbM,
GZjb^)]1 :j=0,1,
...,bi lbM
},
where
G,(s)
=-i-[/V(^+s)-W>.)]
(1) N (5
)
?=0forall se[0, &.].
Atresolutionp, let
{rk
: k =I,
2,
...,
N(S)
}
denotethe observed arrival times;and define
r/k
-rk modbp
for k -1,
..., N(S). Let
r/w
<<^(i)
denote the orderedarrival times within the level-/? cycle
[0,
bp
]
. Then a monotonicallyincreasing
functionRp(s)
is fitted to the points{[r/(k),
k/N(S)]T
:k =
l,
...,
N(S)}
subject to the usualboundary
conditions that7^/,(0)
=0and^p(Z?/))
=l. The final estimateju(t)
ofthe meanvaluefunction is computedasfollows:
wherethefunctions
{2,.(r)
: i =p, p-l, ...,
1,
o}
aredefinediteratively
by
QP(t)
=Rp(t-(jpJ-l)bp),
andfor /=
/?-!,...,1,0, take
Q,(t)
=Ri((j,+l,,-i)bl+l-(jil--i)bi)
+
lR,(Jl+l,,bM-Ui,<-Vbl)
-^(O",-+i,-1)^+1-O'1-,-1)^)]qi-+i(0,
(2)
wherethe interval containingtis within each resolution isgiven
by
theindex,
y,,isthe uniqueinteger
j
such that(j-\)bi
<t<jbt
.In order to construct the mean value function utilizing the multiresolution
procedure, the functions {R)(s): i =
0,1,2,...,/?}
at each resolution must be selected.Therefore,
a special form of a polynomial that meets these requirements namely, adegreerpolynomial oftheform
Ri(s)
=sib,,
E/M'/M'
+
fi-I^
r-1 ( r-1 A
k=\
if r =
1,
(s/bi)'
if r >1,
(3)
for se
[O.fcJ
is utilized for i =0,
1,
2,
...,p. The coefficients{pk
:k=\,
...,k-l}
in(3)
are constrained to yield #,'(s)>0 forall s e[0,
fc.]. This polynomial automaticallysatisfies the required
boundary
conditions.Further,
the simplest process that can bemodeledis theone with a constant arrival rate overtheinterval
[0, bt],
which correspondsto the simplestformofthemodel
(3)
wherein r= 1 istaken (thatis,
alinear mean valuefunction).
4.3
Estimating
/?.()
To estimate the function
(1)
by
a fitted functionR,
(s)
of the form(3)
at eachresolution level
i,
the appropriate degree rofthe polynomial(3)
must be determinedandthen the associated polynomial coefficients p\,
p.,
...,/3r-\
are estimated. The standardapproach for solving such a problem is to perform regression analysis using a forward
selection orbackwardelimination procedure (DraperandSmith 1998).
In a forward selection procedure, polynomials of degree r- 1
and r are fitted to
the sample
data;
then the mean square erroris computed foreach fit and an appropriatelikelihood ratio test is performed to determine if significant improvement in the fit is
achieved
by
increasing
the degree of the fitted polynomial from r - 1 tor. If a
significantly better fit is obtained withthedegree-rpolynomial, then the
fitting
procedureis continued
by
incrementing
the value of r and repeating the likelihood ratio test;otherwise, the
fitting
procedureis terminated,delivering
r- 1as the degree ofthe fitted
However,
the assumptions of classical regression analysis require observations ofadependent variable thatare independentand normal with a constant variancesothat the
corresponding residuals are independent and
identically
distributed normal randomvariables. In the case of
fitting
arrival data obtained from an NHPP that satisfiesAssumptions 1 and
2,
the observed cumulative relative frequencies within each basiccycle are neither normal nor
independent;
moreover, such responses do not possess aconstant variance.
Therefore,
a variance stabilizingtransformation mustbe appliedto thedata before standard statistical procedures to determine an appropriate degree r for the
function
R(
(s)
are used.4.4 Likelihood Ratio Test Procedure
At each resolutionlevel
i,
let mt =bjbi+x
-1 for0<i<p
-1 and mt =N(S) fori =p; andapolynomial function
Rt
(s)
ofthe form(3)
is fitted with appropriatedegree rtoaset of points
having
thegeneralform(Zj,W.)T:j=l,...,m.l (-*)
asdefinedin 1. Forexampleiftheresolutionlevel iis in therange
{l,
...,p-l}, thenat
the;thpointin
(4)
theabscissaZj=jbM
andthe ordinateWj=Gi(jbl+l)
for;=
1,
..., m,aretaken. Since
Wj
is alwaysa proportion, thevariancestabilizingtransformationYj^sm^ljW^l
for ;=L ...,/n,.is exploited. See pp. 231-238 of
Box, Hunter,
and Hunter(1978)
and pp. 291-294 ofDraperandSmith (1998).
Corresponding
tothe dependentvariable Y. definedby
(5),
theindependentvariableX . =
Zj/bi
=jbM/b;
forj
=1,
..., mi ;
(6)
is definedandinterms ofthevectorofregression coefficients
c
if r =
u
'"[[C1,C2,...,Cr_1],
if r>2,(7)
forthe degree-rpolynomial
T \u;C =
^
w u,
r~x k
k=\
--I
c, 2 7 1 *v fc=l y
r
ifr=l,
ifr>l,
(8)
The
following
statistical model for yyasafunction ofX;
is postulated. If
(9)
is valid, then the transformation(5)
yields approximately normalresiduals with constant variance a2
sothat
{ej-.j
=l,
...,m,\
~ N(0,a2).
(10)
Noticethatin
(8)
Tr(0;Cr)
= 0 and Tr(1;
Cr) =n/2.
(H)
For fixed r (r =
2, 3,
...), the estimator
Cr
of the coefficient vectorCr
is thesolution ofthe
following
constrainedleast-squaresregressionproblem:m.
minC y
ri=i
Y.-T
\X
,CJ n J r
(12)
subjectto
Observethat
r;(:cr)=f
^u^+rf^-f
C*=1 V 2 4=1
fr r-1 >
ur~\
(14)
so that
(13)
canbe interpretedas requiringthezeros ofthedegree-(r-1)