EFFICIEN T SEM IP A R A M ET R IC P R ED ICT IO N IN T ER V A L S A N D R EG IO N S
B ryan W . B rown1
T heconstructionofpredictionintervals andregionsandtheirprobabilitycontentfornonlinearsystems with nonparametricdisturbances is considered. T hesemiparametrice¢ciency bound forestimatingthe probability contentofa known interval(region) and estimators thatattain the bound are developed. Semiparametrice¢cientestimation ofoptimalprediction intervals (regions) which either(i) maximize probabilitycontentgivenintervallength(regionarea) or(ii) maximizeintervallength(regionarea) given probabilitycontentis studied. T heestimatedprobabilitycontentof(i) is foundtohavethesamelimiting behavioras ifthe interval(region) were known with certainty and hence attains the semiparametric e¢ciency bound. Further, the estimated probability ofthe estimated interval(region) approximates the true coverage probability toorderpn for(i) butordersmallerthanpn for(ii). A M onte Carlo experimentis conducted tocomparethenewpredictors tocompetitors.
Keywords: Semiparametrice¢ciency bound, optimalprediction intervals, prediction regions, nonlinearsystems
1. Introduction
T he conditionalprediction problem involves developing knowledge of the distribution of thepredictandvariable(s) outsidethesampleperiod givencertain conditioninginformation. T he attribute of this distribution that has received the most attention is its center, as manifested by, say, the conditionalmean. T his point prediction problem is well-studied forlinearmodels, where the problem reduces to …nding the centerofthe distribution of the disturbances, typically zero, and substitutingparameterestimates intothe conditional mean function. In nonlinear(in thevariables) models, this problem involves morecomplete knowledgeofthedistributionofthedisturbances, andhas beenaddressedthroughtheuseof simulation techniques toestimatetheconditionalmeanusingdraws from an estimateofthe distribution ofthedisturbances. T his distribution could eitherbe parametricas in H owrey and Kelejian(19 7 1) ornonparametricas inB rown andM ariano(19 8 4).
B eyond pointprediction, we are interested in determininga range ofvalues ofthe pre-dictand variables and some measure ofthe probability offalling in the range. Fixingthe probabilityatagiven value, theproblem becomes oneofdevelopingan appropriateinterval in the univariatecase and an appropriateregion in themultivariatecase. A gain, forlinear models, agreatdealisknownif, in addition, thedisturbances areassumedtobenormal. In this case, thepredictand is conditionally normaland completely characterized by its mean and covariancematrixandtheconstructionofpredictionintervals andregions is straightfor-ward and well-studied. In particular, thetargetintervals and regions can berepresented as knownfunctions ofestimatedparameters thatareappropriate, atleastasymptotically and, forsomecases, in…nitesamples. Similarly, ifthedistribution ofthedisturbances is nonnor-malbutstillparametricallyspeci…ed, theintervals and regions cangenerallyberepresented as known functions ofestimated parameters thatare, atleastasymptotically, appropriate.
U nfortunately, ifthe modelis nonlinearin the variables orthe distribution ofthe dis-turbances is nonparametricthen theconstruction oftheintervals and regions is somewhat
1P artialsupportfrom N SF grantSES-9 9 058 16is gratefully acknowledged. T heauthorwishes tothankM ahmoud
more complicated and less wellstudied. A lthough the construction ofprediction intervals and regions is an obviously importanttopic and mostpredictive models are nonlinearin thevariables, therehas beenverylittleworkconcerningthebehaviorofprediction intervals and regions in nonlinearsimultaneous systems. L ikewisetherehas been very littleworkon prediction intervals andregions in linearmodels when thedistribution is notspeci…ed. A n exception is the unpublished paperby B rown and M ariano(19 9 1), which considers M onte Carlosimulation-based prediction intervals andregions when thedistribution ofthedistur-bances is known and residual-based prediction intervals and regions when the distribution in notspeci…ed.
T he purpose ofthis paperis todevelop techniques appropriate forconstruction ofpre-diction intervals and regions when the disturbances are nonparametricand the structural modelis nonlinear. T heresearch presented in this paperbuilds on the residual-based tech-niques in B rownandM ariano, which areappropriateformodels wherethedisturbances are independentoftheregressors.T heapproachintroducedinB rownandN ewey(19 9 8 )forsemi-parametrice¢cientestimation ofexpectations is appliedtodevelop semiparametrice¢cient estimates ofthe probability contentofknown intervals and regions forparametric models withnonparametricdisturbancedistribution.B eyond knownintervals and regions, intervals and regions thathaveasymptoticoptimalpropertiesinthesenseofminimalareaforagiven probabilitycontentormaximalprobabilitycontentforagiven areaaredeveloped.A lthough explicitlydeveloped fornonlinearsystems, thetechniques shouldapplyequallywelltolinear systems withnonparametricdisturbancedistribution.
T heoutlineofthepaperisasfollows. Inthesecondsection, thebasicmodelisintroduced andrelevantpreviousresultsreviewed. T heconstructionandpropertiesofe¢cientprediction intervals forsemiparametricmodels, includingoptimalintervals, are presented in thethird section. Inthefourthsection, e¢cientsemiparametricpredictionregions, includingoptimal regions, aredeveloped. T he relative behaviorofthe proposed optimalpredictors and their competitors are presentedand contrasted viaaM onteCarlostudyin the …fth section. T he results aresummarized and anumberofpromisingpossible extensions are discussed in the …nalsection. Forthepurposes ofthis paperconstruction ofaprediction intervalorregion is construedtoincludeboththeestimationoftheintervalorregion givenaprobabilitycontent and estimation oftheprobabilitycontentgiven theintervalorregion.
2. B asic Concepts
In this paper, we willconsider prediction in a static nonlinear system with independent errors. T hedatageneratingprocess forsuch asystem can berepresented
y= ¼(";x;¯) (1)
whereyis ag£1 vectorofendogenous variables, xis ak£1vectorofexogenous variables, "is ag£1 disturbancevector,¼(¢)is ag£1 vectorofknown functions, and¯ is ap£1
vectorofunknown parameters. T he vectorofdrivingvariables("0;x0)are assumed tobe
jointlyi:i:d. withajointdistribution thatsatis…es independenceof"andxbutis otherwise unspeci…ed and unrestricted, exceptforsomesmoothness restrictions. N otethedistribution ofthedisturbances"areallowedtohaveanonzerolocationparameter, say® = E ["], sothe
parametervector¯ does notincludean intercept.
W e assumethatthe relationship betweenyand", givenx, is one-to-one. T hus, we will restrictourattention tomodels which havetheunique inverserepresentation
"= ½(y;x;¯): (2)
where½(¢) is a known function. T his inverse representation is usually interpreted as the
structuralform ofthe modelwhile the data generating process is the reduced form. A numberofmodels fallwithin this framework includingthe linearand nonlinearregression models and thelinearand nonlinearsimultaneousequation models. Inpractice, thereduced form correspondingtoaparticularstructuremaynotbeavailableinclosed form butcan be obtained through numericaltechniques.
W eareinterestedintheconditionalprediction oftheendogenous variables forsomeout-sidesampleobservation, denotedbysubscript¿, given thevaluesoftheexogenousvariables. T he prediction approaches introduced below willdepend crucially on estimation
ofcondi-tionalexpectations ofknown functions ofthe endogenous variablesy¿ given the exogenous
variablesx¿. D ue to the independence assumption, such expectations have the canonical
representation ¹(¯ ;h) = E ¯;h[g(y)jx= x¿)]= Z g(¼(";x¿;¯))f"("jh)d " (3) = Z g(¼(½(z ;¯);x¿;¯))fz (z j¯;h)d z = Z m (z ;¯)fz (z j¯ ;h)d z
whereg(¢) and hencem (¢) = g(¼(½(¢;¯);x¿;¯)) are knownq£1 functions, f"(¢jh)is the
density of", fz (z j¯ ;h) is the density ofz , andh is an unknown function to re‡ectthe
distributionfreenatureofthespeci…cation. T helastlineisthecanonicalexpectationstudied in B rown and N ewey(19 9 8 ) with restrictions impliedbytheform in thesecond line.
T he essentialcomplications in the estimation of¹ are the unavailability of¯ and the
inabilitytoperform theindicatedintegrationsincethedistributionisunspeci…ed. T hemost
naturalresponse tothesecomplications is toestimate¯ with, say, ¯^and approximate the
weproposethe method ofmoments estimator ^ ¹ = n¡1X n t= 1g(¼(b"t;x¿;^¯)) (4) = n¡1X n t= 1g(¼(½(z t;b¯);x¿;^¯)),
whereb"t= ½(z t;b¯). T hefunctionsg(¢)andm (¢;¯)
areunrestrictedexceptforsomesmooth-nessintheexpectationofthelatter, whichwillbeimposedbelow. T his is theresidual-based estimatorofthetargetexpectation proposed byB rown and M ariano(19 8 4).
Itis instructive toexamine severalexamples ofresidual-based estimators considered by B rown and M ariano. Forpointprediction weareinterestedin°(x¿) = E [y¿jx¿], whereupon
g(y) = y, andthemethod ofmoments estimatoris given by
b °(x¿) = n¡1 X n i= 1¼(½(z i;¯b);x¿;b¯): (5) Inmeasuringpredictiveaccuracy, thesecondconditionalmoment- (x¿) = E [(y¿¡°(x¿))(y¿¡ °(x¿))jx¿]is importantandmaybeestimated by b - (x¿) = n¡1 X n i= 1(¼(½(z i;b¯);x¿;¯b)¡b°(x¿))(¢) 0. (6)
A nd the conditionaldistribution functionF1(c;x¿) = E [1(y1 ·c)jx¿], which is ofdirect
interestbelow, hasg(y) = 1(y1 ·c)with
b
F1(c;x¿) = n¡1
X n
i= 11(¼1(½(z i;¯b);x¿;¯b)·c) (7 )
as its method ofmoments estimator.
Supposethatthedataaregenerated byaparametricmodelwhichsatis…es
thesemipara-metricassumptions and contains the truth. Such a modelis called aparametric submodel
sinceitis asubsetofthemodelconsistingofdistributions satisfyingtheassumptions. For-mally, wesuppose
z »f(z j¯0;h(´0)) (8 )
where´0 is a …nite-length vectorofshape parameters forthe true distributionf(¢), and a
zerosubscriptindicates thetrueparametervalue. T hesetofparametricsubmodels, then, is de…ned as thesetofdistributions which satisfythesemiparametricassumptions and
fi(z j¯;hi(´i)) = f(z j¯0;h(´0)) (9 )
forsome µi = (¯0;´i0)0= (¯00;´i00)0= µi00and allz , where´iis a shape parameterfor
parametricsubmodeli. N otethatthelength oftheshapeparametervector´iand henceµi
may di¤erfordi¤erentparametricsubmodels.
Projections ontothe spaces spanned by the scores ofthe parametricsubmodels
areim-portantin determiningthe semiparametricestimators and e¢ciency bounds. L etSi
µ(z ) =
(S¯i(z )0;S´i(z )0)0denote thescores ofaparametricsubmodel. D e…nethenonparametric
tan-gentsetas themean squareclosureoftheunion ofallpossible q-dimensionallinearcombi-nations ofS´(z ), i.e.
whereBjareconstantmatrices withqrowsandE [¢]denotesexpectationatthetruth. N ewey
(19 89 ) has previouslystudiedtheestimationoftheparametervector¯ forthepresentmodel
undertheindependenceassumption and shownthatthenonparametrictangentsetis given by
T= ft1(") + t2(x) :E [t1(")]= E [t2(x)]= 0g; (11)
wheret1(¢)andt2(¢)areunrestrictedfunctions exceptforthemean zeroproperty. N otethat
theresidualoftheprojectionofthescoreS¯(z ), foranyparametricsubmodelwhichincludes
thetruth, on the nonparametrictangentset,
S(z ) = S¯(z )¡P roj(S¯(z )jT); (12)
is known as thee¢ cientscorefor¯, whereP roj(g(z )jT)denotes the projection ofg(z )on T.
Similarly, de…ne thetangentsetS as the mean square closure ofthe union ofall
q-dimensionallinearcombinations ofSµ(z )forallregularparametricsubmodels satisfyingthe
semiparametricassumptions, i.e.
S= fs2Rq:E [s0s]< 1;9A
j;Sµj(z )s.t.E [ks¡AjSµj(z )k2]= o(1)g, (13)
whereAjareconstantmatriceswithqrows. A smightbeexpectedthereisacloserelationship
between thetwotangentsets. M ore compactly, wecan writeS= fB s¯ + ¿:¿2 TandB
is aconstantq£p matrixg. B y de…nition, B s¯ = B s+ B P roj(s¯jT) = B s+ ¿ for¿ 2 T,
whereuponS = fB s+ ¿ :¿ 2 Tg. N ote thatthe twocomponents are orthogonal, which
implies thataprojection ontoS can beobtained as the sum oftheprojection ontothetwo
components.
Sincea distribution is notexplicitlyspeci…ed in obtaining¹^, the
estimatorwillbesemi-parametric if¯^ is semiparametric. Speci…cally, ¯^ should remain consistent for any
dis-tribution satisfyingthe semiparametric assumptions. A ccordingly, we make the following assumptions and obtain theaccompanyingT heorem. Proofs aregivenin the A ppendix.
A ssumption1: ¯^is asymptotically linearwith in‡uence functionÃ
¯(z ), E [ï(z )] = 0,
andV¯ = E [ï(z )¢Ã¯(z )0]…nite.
A ssumption2: M (¯) = @E [m (z ;¯)]=@ ¯0exists and continuous on a neighborhood of ¯0.
A ssumption3: n¡1=2 Pn
t= 1[fm (z t;¯)¡E [m (z ;¯)]g ¡ fm (z t;¯0)¡¹0g]stochastically
equicontinuous at¯ = ¯0.
A ssumption4: Vm = E [(m (z ;¯0)¡¹0)¢(m (z ;¯0)¡¹0)0]exists and …nite. T heorem1: Suppose A ssumptions 1-4 are satis…ed,then
n1=2(^¹ ¡¹0)
d
¡! N (0;V¹); (14)
T hisresultdemonstrates thatthemethodofmoments estimatoris consistentandasymp-toticallynormalunderfairlystandard conditions. Sincewearecomparingtheestimators on thebasis ofasymptoticvariance, A ssumption4, whichassumes theexistenceofavarianceis fairlyinnocuous. Itwillbesatis…edforindicatorfunctionssuchasusedbelow. A ssumption3, thestochasticequicontinuityassumption, willbemetif, forexample,(m (z t;¯)¡E [m (z ;¯)])
satis…es a centrallimittheorem throughouta neighborhood of¯0:In particular, ifm (¢) is
an indicatorfunction, as below, this condition willbemet. A nd thecontinuous di¤erentia-bility condition, A ssumption 2, is the standard approach forobtainingderivative terms in theasymptoticexpansion when theunderlyingfunctions arediscontinuous.
B asedonthelimitingcovariancematrix, alternativemethodofmomentsestimators based
ondi¤erentestimatorsb¯ canberanked interms oftheV¯. T his suggests thatalowerbound
ofsome sortis attained ifthe estimatorb¯ is itselfsemiparametrice¢cient. T he theorem
belowveri…esthisconjectureusingadditionalnotationandassumptions. Foreachparametric submodel, de…nethetargetparametricfunction in terms oftheunderlyingparameters
¹(µ) = ¹(¯;h(´)); (15) whereµ= (¯0;´0)0and we have dropped thesuperscriptiindexingthe various parametric families.
A ssumption5: Forallparametric submodels, Eµ[kï(z )k2]exists and continuous on a
neighborhood ofµ0.
A ssumption6:Forallregularparametricsubmodels,¹(µ)di¤ erentiableandEµ[km (z t;¯0)¡
¹0k2]exists and continuous on aneighborhood ofµ0.
T heorem2: SupposeA ssumptions 1-6aresatis…ed,regular^¯ exists, E [S¢S0]exists and
nonsingular, and[Vm + M V¯¤M 0]is nonsingular,then¹bis regularandattains
thesemipara-metric e¢ ciencyboundV¹¤= Vm + M V¯¤M 0for¯bsemiparametrice¢cient.
T hus weseethatthemethodofmoments estimatorattains thesemiparametrice¢ciency
bound when based on¯bsemiparametric e¢cient, as was conjectured above. T he above
theorems provides a more directalternative torelated results in B rown and N ewey(19 9 8 ). T he results there targetedE ¯;h[m (z ;¯)]with more generalforms ofm (¢) and reduced to
thepresentresults undertheassumption ofindependence andm (z ;¯) = g(½(z ;¯);¯). T he
basicdi¤erence is thatcondition (d) in T heorem 2 there, which guarantees asymptoticin-dependence with respecttothe nuisance parameters is notneeded in the presentcontext. In addition severaloftheconditions there can becombined intoasinglesimplercondition.
Finally, thedirectapproachtakenhereavoids theneedtoconsiderthetheoryofV-statistics,
3. P rediction Intervals
Inthis section, theestimationofpredictionintervals and theirprobabilitycontentis consid-ered. In the presentation ofthis section, we willnotdiscuss the mostusualapproaches to constructingintervals such as intervals symmetricaround theconditionalmean orintervals with equaltailprobabilities. Instead, the focus is on the construction and estimation of optimalintervals and regions, which willgenerally di¤erfrom the usualapproaches. T he approach considered in the followingsubsection can be easily adapted to handle the con-struction ofintervals symmetric around the mean. In any event, forthe cases where the usualapproaches makethemostsenseand turnouttobeoptimal, such as thelinearmodel with normaldisturbances, the optimalapproaches introduced below willturn out to be asymptoticallyequivalent.
3.1 Known Interval, Estimated P robability
W estartbyinvestigatingtheestimation oftheprobability ofaknown intervalfor, without loss ofgenerality, the…rstendogenous variable. Considerthehalf-openinterval(c1;c2], and
de…ne Pk(c1;c2;x¿) = P r[(c1< y1·c2)jx¿] (16) = E [1(c1< ¼1(½(z ;¯0);x¿;¯0)·c2)jx¿] = Z 1(c1 < ¼1(½(z ;¯0);x¿;¯0)·c2)fz (z ;¯0;h0)d z
astheprobabilityofyfallingintheintervalgivenx¿. W eusethehalf-openintervalbecause
theprobabilitycanthenbewrittenas thedi¤erencein twoc.d.f.’s. O fcourseifthedensity is continuous, then the di¤erence in the probability contentbetween an open, closed, and half-openintervalis zero.
FollowingB rown and N ewey(19 9 8 ), thee¢cientestimateofthis conditionalexpectation undertheindependenceassumption is given bytheaverage
c
Pk= n¡1X n
i= 11(c1 < ¼1(b²i;x¿;b¯)·c2) (17 )
whereb²i= ½(yi;xi;b¯)and¯bisasemiparametricallye¢cientestimator. A ndtheasymptotic
limitingbehavioroftheestimatoris given byapplication ofT heorem 1 is n1=2(Pck¡Pk)¡!d N (0;Pk(1¡Pk) + Pk
¯V¯P¯k0) (18 )
wherePk
¯ = @E z [1(c1 < ¼1(½(z ;¯);x¿;¯)·c2)]=@¯0j¯=¯0. B y T heorem 2, the covariance
matrix ofthis limitingdistribution represents thesemiparametrice¢ciency bound foresti-mation oftheprobabilitycontentoftheknown interval(c1;c2]forV¯ = V¯¤and is attained
whenb¯ is semiparametrice¢cient.
3.2 O ptimalP robabilityInterval(G iven L ength)
Iftheinterval(c1;c2]is arbitrarilychosen, thenitcanlikelybeimprovedupon. Speci…cally,
wecanoften…ndanintervalofsimilarlengthA= c2 ¡c1thathashigherprobabilitycontent.
choosingtheintervalofgiven lengthAthathas highestprobability content:
m ax
c1;c2
Pk(c
1;c2;x¿;¯0;h0);s:t:c2 ¡c1 = A. (19 )
Ifthe conditionaldensityfyjx¿(¢jx¿;¯0;h0)exits and is continuous, then the…rstordercon-ditions forthis optimization arefyjx¿(c2jx¿;¯0;h0) = fyjx¿(c1jx¿;¯0;h0) togetherwith the sideconditionc2 ¡c1= A, which implicitlyde…nes theuniquesolutionsc¤1= c1(A;x¿;¯0;h0)
andc¤2 = c¤1 + A. Substitution ofthe optimalintervalintothe probability function yields
P¤(A;x¿;¯0;h0) = Pk(c¤1;c¤1+ A;x¿;¯0;h0)astheprobabilitycontentoftheoptimalinterval.
Inordertoapplythemethod ofmomentsapproachtoestimatetheprobabilitycontentof
theoptimalinterval, wemust…rstestimatethenuisanceparameterc¤
1. L etfbyjx¿(¢;x¿)denote a consistentestmator, such as the kernel, offyjx¿(¢;¯0;h0)and takecb¤1 = cb1(A;x¿)as the solution tothe implicitfunctionfbyjx¿(cb¤1+ A;x¿) = fbyjx¿(cb¤1;x¿). T hen afeasible estimator
oftheprobabilitycontentofthe estimated optimalintervalcan beestimated by c
P¤= n¡1X n
i= 11(cb
¤
1< ¼(½(yi;xi;b¯);x¿;¯b)·cb¤1+ A). (20)
Interestingly, the limitingdistribution ofthis estimatoris thegiven by n1=2(Pc¤¡P¤)¡!d N (0;P¤(1¡P¤) + P¤
¯V¯P¯¤)
whereP¯¤= @E [1(c¤1 < ¼1(½(z ;¯);x¿;¯)·c¤1+ A)jx¿]=@¯0j¯=¯0, which is the sameas ifc¤1
were known with certainty. T hus, forb¯ semiparametrice¢cient, Pc¤is the semiparametric
e¢cientestimatoroftheprobabilitycontentofthetrueoptimalinterval, which is unknown butconsistently estimated by(cb¤
1;cb¤1+ A]
U ltimately, ofcourse, we are interested in the coverage probability ofthe estimated
in-tervalrelative tothe estimated probability. Foryoutside the estimation sample, which is
appropriateforoutsidesampleprediction, andhenceindependentof(cb¤ 1;cb¤1+ A], wecanshow P r[cb¤1 < y·cb¤1+ Ajx¿]= E [E [1(cb¤1< ¼(";x¿;¯0)·cb¤1+ A)jcb¤1;x¿]jx¿] (21) = P¤+ E [op(n¡1=2)]= P¤+ o(n¡1=2) providedfbyjx¿(¢;x¿) = fyjx¿(¢;x¿;¯0;h0)+op(n¡ 1=4)andhencecb¤ 1= c¤1+op(n¡1=4). Combining thetworesults, we…ndthat c P¤ = P r[cb¤ 1< y·cb¤1+ Ajx¿]+ n¡ 1=2 N (0;P¤(1¡P¤)+ P¯¤V¯P¯¤)+ op(n¡1=2) (22) = P r[cb¤ 1< y·cb¤1+ Ajx¿]+ Op(n¡1=2)
with the discrepancy between the trueand estimated probability ofthe estimated interval resultingfrom estimatingthetrueprobabilitycontentofthetrueoptimalinterval.
G iven intervallength, the motivation forusingthe optimalintervalformakinga prob-ability statementis clear. T he complication is thatwe mustestimate theendpoints ofthe intervalas wellas the probability content. A lthough the endpointestimators converge to
theirtargets ata rate slowerthanpn, the estimated probabilityPc¤willconverge tothe
probability contentoftheestimated intervalatapn rate. M oreover, itis easy toseethat
p
n(Pc¤¡P r[cb¤
a lowerbound. T hus, Pc¤provides a semiparametrice¢cientestimatorforthe probability
contentofthetrueandestimated optimalintervals.
A ttainingafasterthann1=4 rateofconvergenceforthekernelestimatorfb
yjx¿(¢;x¿)may
be a problem ify¿ and/orx¿ are ofhigh dimension. T hedimensionality intoduced by the
conditioningvariablescanbeeliminated, however, byusingarestrictedkernelestimator. D ue totheindependenceassumption, wehavefyjx(yjx) = f"(½(y;x;¯))jd et (@½(y;x;¯)=@y)jand
thecorrespondingestimator b fyjx(yjx) = fb"(½(y;x;¯b)) ¯ ¯ ¯d et³@½³y;x;¯b´=@y´¯¯¯, (23) wherefb"(¢)is the kernelestimatorofthe density of"and willnotsu¤erfrom the
dimen-sionality ofx. If" is a longvector, with more than three elements, then we willneed to utilizehigher-orderkernels toattain the required rate ofconvergence. A n added bene…tof usingtheso-restrictedkernelis thatthecorrespondingc.d.f. estimatorhas theproperties of
asmoothed unconditionalc.d.f.estimatorand willhaveapn rateofconvergence.
3.3 O ptimalL ength Interval(G iven P robability)
T hedualtotheaboveoptimization with respecttotheintervalis probablyofmoreinterest. Speci…cally, foragivenprobabilitycontent, wecanchoosetheintervaltobeminimallength. Continuingtoassume unimodalbehaviorthis problem can beformalized as
m in
c1;c2
A= c2 ¡c1; s:t:Pk(c1;c2;x¿;¯0;h0) = P (24)
which has as …rstorderconditionsfyjx(c2;x¿;¯0;h0) = fyjx(c1;x¿;¯0;h0)togetherwith the
sideconditionPk(c
1;c2;x¿;¯0;h0) = P. L etA¤denotethevalueofAattheminimum, then
c¤2 = c¤1+ A¤attheminimum andc¤1= c1(A¤;x¿;¯0;h0), which is thesameas before, bythe
…rst-orderconditions. Substitution intotheside condition yieldsA¤as theuniquesolution
totheimplicitequation P= Pk(c
1(A¤;x¿;¯0;h0);c1(A¤;x¿;¯0;h0) + A¤;x¿;¯0;h0) (25)
whilec¤1 = c1(A¤;x¿;¯0;h0) andc¤2 = c¤1 + A¤. T his is easily seen as the inverse function
tothesolutionofthemaximum probabilitygiven lengthproblem, presented in theprevious subsection, and inacertainsense is aquantile.
T he optimalendpoints can be estimated directly bycb¤
1 = cb1(Ac¤;x¿) andcb¤2 = cb¤1+ Ac¤
wherethecorrespondingestimated intervallengthAc¤solves theimplicitsystem
P= Fbyjx¿(cb1(Ac¤;x¿) + Ac¤;x¿)¡Fbyjx¿(cb1(Ac¤;x¿);x¿) (26) andFbyjx¿(¢;x¿)is thesmoothedestimated c.d.f. correspondingtofbyjx¿(¢;x¿). M oredirectly,
wecan use thedi¤ erenceintheempiricalc.d.f.’s and takeAc¤as thesolutionto
c A¤= sup A ³ n¡1X n i= 11(cb1(A;x¿)< ¼(½(yi;xi;¯b);x¿;b¯)·cb1(A;x¿) + A)·P ´ (27 ) which is theinverseofPc¤, theestimatedprobabilityfunction, given intheprevious
onlyasymptoticallyone-to-one.
In eithercase, analogous toabove, wecan showthat n1=2(Ac¤¡A¤) d
¡! N (0;fyjx(c¤1;x¿;¯0;h0)¡2fP(1¡P) + P¯V¯P¯g) (28 )
where nowP¯ = @E z [1(c¤1 < ¼1(½(z ;¯);x¿;¯) ·c¤1+ A¤)]=@¯0j¯=¯0 with the limiting
co-variance matrix representingasemiparametrice¢ ciency bound forestimation ofthe opti-mallength. A nd foroutsidesample prediction, providedfbyjx¿(¢;x¿) = fyjx¿(¢;x¿;¯0;h0) +
op(n¡1=4), we similarly …nd thatthe probability contentofthe estimated optimalinterval
with given probabilitycontentis given by
P r[cb¤1 < y·cb1¤+ Ac¤jx¿]= E [E [1(cb¤
1< ¼(";x¿;¯0)·cb¤1+ Ac¤)jcb¤1;Ac¤;x¿]jx¿] (29 ) = P+ E [fyjx(c1¤;x¿;¯0;h0)(Ac¤¡A¤)+ op(n¡1=2)jx¿]
= P+ E [n¡1=2N (0;P(1¡P) + P
¯V¯P¯)+ op(n¡1=2)jx¿]= P+ o(n¡1=2).
T hus, although the endpoints ofthe estimated optimalinterval(cb¤
1;cb¤2]have a slowerthan p
n rate ofconvergence, the probability contentofthe estimated intervalconverges tothe
ostensibleprobabilityatafasterthanpn rate.
3.4 M ultimodalD istributions
Intheabovediscussion, weassumedthattheconditionaldistributionofy1was unimodal. If
thedistribution is multimodalthen theapproachwillneed somemodi…cation. Speci…cally, wemustentertainthepossibilitythattheintervalwillbediscontiguouswithonesub-interval foreach mode. T hegeneraloptimalityapproach willstillworkwitheitherthetotallength ofthe intervals setand the probability contentmaximized orthe probability contentset
and the totallength ofthe intervals minimized. T he …rst-orderconditions willbe the
same with the density thesame atallthe endpoints ofthe subintervals. Itturns outthat thereis a more directapproach thatworks forboth theunimodaland multimodaloptimal prediction intervaland alsoforchoosingthe optimalprediction region forthe multivariate case. A ccordingly, I turn nowtotheprediction regionproblem.
4. Efficient P rediction R egions
In this section, theestimation ofprediction regions and theirprobability contentis consid-ered. T he objective, ofcourse, is tomake conditionalprobability statements regardinga vectorofendogenous variables ratherthan a scalarendogenous variable as in the previous section. A s in the previous section, we willonly discuss the construction and estimation ofoptimalintervals. T here are both advantages and disadvantages forprediction regions foracomplete vectorcompared toavectorprediction intervals applied toeach elementof thevector. T headvantage is thatarea ofthe region implied bytheunion oftheunivariate intervals is almostivariableylargerthan an optimallyconstructedregion. T hedisadvantage is thatprediction intervals are much easiertointerpretand more easily understood by the uninitiated.
4.1 Known R egion
W e …rstexamine the estimation ofthe probability contentofa known region. SupposeR
denotes some region in the space offeasible values ofy. T hen, analogous tothe interval
case, theprobabilitycontentoftheregion is given by
Pk(R ;x¿;¯0;h0) = P r[(y2R)jx¿] (30) = E [1(¼(";x¿;¯0)2R )jx¿] = E [1(¼(½(z ;¯0);x¿;¯0)2R)jx¿]. T hemethodofmomentsestimatoroftheprobabilitycontentPk(R;x ¿;¯0;h0) istheresidual-basedestimatorPck= n¡1P
i1(¼(b²i;x¿;¯b)2R)which, byT heorem 1, willhavethelimiting
behavior
n1=2(Pck¡Pk)¡!d N (0;Pk(1¡Pk)+ Pk
¯V¯P¯k), (31)
whereP¯k= @E [1(¼(½(z ;¯);x¿;¯)2R )jx¿]=@¯0j¯=¯0. W ith some regularity, as indicated by
T heorem 2, the covariancematrix ofthis estimatoris the semiparametrice¢ciency bound
forestimation oftheprobabilitycontentoftheknown regionwhenV¯ = V¯¤. T his e¢ciency
bound willbeattained bythemethod ofmoments estimatorifb¯ is semiparametrice¢cient.
4.2 O ptimalP robabilityR egion (G iven A rea)
Considerthe choice ofan optimalregion given thattheregion has agiven area orvolume.
Firstweneed togivesomestructuretothe choiceoftheoptimalset. L etAdenotetheset
ofB orel-measurable sets with volumeA, then ourproblem is choosingfrom amongAthe
setR ¤withmaximalprobability measure. T hatis,
R¤= argm ax
R2A P
k(R;x
¿;¯0;h0). (32)
whereA = fB 2 B(Rg) :V(B) = Ag. U ndersu¢cientsmoothness, anecessary condition thatthemaximizingsetmustsatisfy is thatitbeamemberofthelevelsets ofthedensity, which arede…nedbyR (q) = fy:fyjx(y;x¿;¯0;h0)¸qgforanychoiceofthelevelq.
N ote thatV(R(q)), thevolume ofsuch regions, is monotonically decreasinginq. Ifthe
monotonicity is strict then we can …ndq¤as the solution to the implicitfunctionA =
then we have
q¤= q(A;¯0;h0) = inf
q f
R
1(fyjx(y;x¿;¯0;h0)¸q)d y·Ag (33)
and theoptimalregion is given byR ¤= fy:fyjx(y;x¿;¯0;h0)¸q(A;¯0;h0)g. Substitution
from thede…nitions ofq¤and correspondinglyR¤into(30) yields
P¤(A;x;¯0;h0) = P r[(y2R ¤)jx¿] (34)
= E [1(fyjx(¼(";x¿;¯0);x¿;;¯0;h0)¸q(A;¯0;h0))jx¿]
= E [1(fyjx(¼(½(z ;¯0);x¿;¯0);x¿;;¯0;h0)¸q(A;¯0;h0))jx¿]
as theprobability contentoftheoptimalregion.
O perationally, we need toestimateq¤and henceR¤, and the probability contentofthe latter. T hecomplication with the…rstis the need toperform amultidimensionalintergral. T his maybeavoidedbytransformingtoexpectations and usingaverages
b q¤= inf
q fn
¡1Pn
t= 1[1(fbyjx(y;x¿)¸q)=fbyjx(y;x¿)]·Ag (35)
wherefbyjx(¢)is aconsistentestimators ofthemultivariateconditionaldensity. T heoptimal
region with areaAmay beestimated byRc¤= fy:fbyjx(y;x¿)¸qb¤gand thecorresponding
probability by
c
P¤= n¡1Pn
t= 11(fbyjx(¼(½(z t;¯b);x¿;¯b);x¿)¸qb¤)). (36)
In practice, qb¤as given by (35) can be obtained by a binary search since the estimated
volumeis alsomonotonicby de…nition.
N ote thatthe nuisanceparametersfbyjx(¢)and henceqb¤willboth beconsistentbuthave
slowerthann¡1=2 rates ofconvergence. N onetheless, thelimitingbehavioroftheprobability
contentestimatoris given by
n1=2(Pc¤¡P¤)¡!d N (0;P¤(1¡P¤) + P¤
¯V¯P¯¤) (37 )
whereP¤is de…ned immediately above andP¯¤= @E [1(fyjx(¼(½(z ;¯);x¿;¯);x¿;;¯0;h0)>
q(A;¯0;h0))jx¿]=@¯0j¯=¯0, which is the sameas iftheoptimalregion were known with
cer-tainty. Furthermore, forb¯ semiparametrice¢cient, the method ofmoments estimatorwill
attain the semiparametrice¢ciency bound fortheestimation ofthe probability contentof
the known optimalregion. A nalogous to the intervalresults, we …nd thatPc¤converges
toP r[y2Rc¤jx¿], the coverage probability ofthe estimated region, atthe raten¡1=2 and
moreoverthatn1=2(Pc¤¡P r[y2Rc¤jx
¿])attains asemiparametrice¢ciencybound, provided
thatfbyjx(¢)andqb¤convergetotheirtargets ataratefasterthann1=4.
4.3 O ptimalA rea R egion (G iven P robability)
W earelikelymoreinterestedinthedualproblem ofchoosingaregionwithgivenprobability
soas tominimizetheareaorvolumeoftheregion. L etP denotethesetofB orel-measurable
with minimalvolume. Formally, wehave
R¤= argm in
R2PV(R), (38 )
whereP = fB 2 B(Rg) :M (B) = AgandM (¢)is probability measure. A s above, under
su¢cientsmoothness conditions, the minimizingsetmustbea memberofthe levelsets of the density. Since the probability measureM (R(q))is alsomonotonicincreasinginq, we have
q¤= q(P;¯0;h0) = inf
q f
R
1(fyjx(y;x¿;¯0;h0)¸q)fyjx(y;x¿;¯0;h0)d y·Pg (39 )
and the optimalregion is given byR ¤= fy:fyjx(y;x¿;¯0;h0)¸q(P;¯0;h0)g.
Substitu-tion into the volume operatoryieldsA¤(P;¯0;h0) = V(R (q¤)) =
R
1(fyjx(y;x¿;¯0;h0) ¸
q(P;¯0;h0)))d y.
O perationally, wedonotneed toestimateA¤since itwillbegivendirectlyas aresultof
estimatingq¤. Substitution ofa sampleaverage fortheexpectation in (39 ) and solvingfor
qin terms ofPyields thefollowingestimatorforq¤: b
q¤= inf
qfn
¡1Pn
t= 11(fbyjx(¼(½(z t;b¯);x¿;b¯);x¿)¸q)·Pg. (40)
B utthis estimatoris justthe approximate inverse function forthe estimated probability given by(27 ) in theprevious subsection. Correspondingtotheresults forintervals, we…nd that
n1=2(qb¤¡q¤)¡!d N (0;f
yjx(q¤;x¿;¯0;h0)¡2fP(1¡P)+ P¯V¯P¯g) (41)
where nowP¯ = @E [1(fyjx(¼(½(z t;¯);x¿;¯);¯0;h0)¸q¤)]=@ ¯0j¯=¯0. G iven ourestimatorof
q¤, the estimated region is given directly asRc¤= fy:fbyjx(y;x¿)> qb¤g. N ote thatqplays
much thesame roleas aquantilein theunivariatecase.
Intheend, weareinterestedinP r[fbyjx(¼(½(z t;¯b);x¿;b¯);x¿)> qb¤jx¿],
thecoverageproba-bilityoftheestimatedregion, relativetothegivenprobabilityP. Followingthedevelopment
in theprevious sectionforintervals, providedfbyjx¿(¢;x¿) = fyjx¿(¢;x¿;¯0;h0)+ op(n¡
1=4), we
have
P r[fbyjx(y;x¿) > qb¤jx¿]= E [E [1(fbyjx(¼(";x¿;¯0);x¿)> qb¤jqb¤;x¿]jx¿] (42)
= P+ E [fyjx(q¤;x¿;¯0;h0)(qb¤¡q¤)+ op(n¡1=2)jx¿]
= P+ E [n¡1=2N (0;P(1¡P) + P¯V¯P¯)+ op(n¡1=2)jx¿]= P+ o(n¡1=2).
T hus, although theboundries oftheestimated prediction region haveslowerthanpn rates
ofconvergence, theprobabilitycontentoftheestimatedregionmayconvergetoitsostensible
5. Sampling Experiment
In the previous two sections, we have examined the asymptotic behaviorof the various prediction intervaland regions. It is ofobvious interest whether or not the asymptotic properties alsoobtaininsmallsamples. Inthis sectionweshallattempttoaddressthis issue byconductingasamplingexperimentforanonlinearsimultaneous system. Inordertokeep thecalculationproblem manageable, themodelis extremelysimpli…ed and has somerather specialproperties. A s aresult, the …ndings regardingthesmallsampleperformance ofthe various estimators arenotnecessarilyapplicabletomore realisticmodels. N evertheless, the studyshould givesomeindication oftherelativeperformanceofthealternative procedures in smallsamples. In particular, weare interested in howquickly the large sample relative e¢ciencies assertthemselves in this model.
Forthesamplingexperiment, weutilizethefollowingtwo-equationnonlinearmodel yt1 = ¯1+ ¯2xt+ ut1
yt2 = ¯3+ ¯4yt21+ ¯5xt+ ut2
where(ut1;ut2)0»i:i:d.N(0;§ ),§ = (¾ij). A specialfeatureofthismodelistheavailability
ofaclosed-form solution:
yt1 = ¯1+ ¯2xt+ ut1
yt2 = ¯3+ ¯4(¯1+ ¯2xt+ ut1)2 + ¯5xt+ ut2.
A s aresult, themoments ofy¿ are readilyobtainablein closed form.
6. Concluding R emarks
Inthispaperwehavestudiedalternativeproceduresforobtainingpredictionintervalsand =or regions in nonlinearsimultaneous systems with independent disturbances. T he need for attachingprobability values toourpredictions is ofobvious importance and has received considerableattention in boththelinearregressionmodeland linearsimultaneous equation model. A lthoughasubstantialfractionofthemodels usedforpredictionarenonlinearsimul-taneoussystems, verylittleattentionhasbeengiventowhatproceduresmightreasonablybe usedtogenerateprobabilityvalues. T heworkpresented in this paperapplies theresults on semiparametrice¢cientestimation ofexpectation functions by B rown andN ewey(19 9 8 ) to thepredictioninterval/regionproblem andtherebyextendsthepreviousunpublishedworkof B rownandM ariano(19 9 1)onpredictionintervals andregions. T helatterwas onlypartially semiparametricsince the parameterestimatorwas assumed tobe equivalenttomaximum likelihood.
T heconstructionofpredictionintervals is examinedinSection3. T heapproachofB rown and N ewey is applied toobtain the semiparametrice¢ciency bound forestimation ofthe probabilitycontentofaknown interval. T helimitingdistributionofthemethodofmoments estimatoroftheprobability contentis developed and shown toobtain the e¢ciency bound when based on semiparametrice¢cientparameterestimates. O ptimalprediction intervals thatmaximize the probability contentofthe intervalgiven the intervallength arestudied and feasibleestimators oftheintervaland theirprobabilitycontentdeveloped. T hefeasible estimatorsoftheprobabilitycontentisshowntobeasymptoticallyequivalenttoanestimator based on the true optimalinterval. T he estimated probability is shown todi¤erfrom the coverageprobabilityoftheestimatedintervalbytermsoforderand, moreover, thedi¤ erence attains a lowerbound when the method ofmoments probability estimatoris based on a
semiparametrice¢cientestimatorof¯. T hedualproblem ofminimizingtheintervallength
given theprobability contentoftheintervalis alsoconsidered. A feasible estimatorofsuch an intervalis developed its asymptoticbehaviorexamined. T hecoverageprobabilityofthe estimated intervalis shown todi¤erfrom theostensibleprobabilitybyterm ofordersmaller
thanpn.
T he construction ofprediction regions is studied in Section 4. T helimitingbehaviorof themethodofmoments estimatoroftheprobabilitycontentofaknown regionis developed and shown to attain the semiparametric e¢ciency bound when based on semiparametric e¢cientestimates ofthe parameters. T he construction and estimation ofoptimalregions which maximizeprobability contentgiven region areaorvolume is examined. T heoptimal regionsareshowntobelevelsetsoftheconditionaldensityoftheendogenousvariablesgiven theexogenous variables. Feasibleestimators thatare based on nonparametricestimators of the density and method ofmoments estimators ofthe probability contentare devised and shown toattain the semiparametice¢ciency bound forestimatingthe probability content ofa known optimalregion when based on semiparametrice¢cientestimates of¯. R egions which minimize theareaorvolumeoftheregion given theprobabilityarealsostudied and feasibleestimators devised. T heasymptoticbehaviorofthefeasibleestimators oftheregion is developed. A s withtheintervalcase, thecoverageprobability oftheestimatedintervalis
shown todi¤ erfrom theostensibleprobabilitybyterm ofordersmallerthanpn.
T he results ofasamplingexperimentarepresented in Section 5. (Tobecompleted) T here are a numberofdirections in which the research outlined in this section can be
extended. T heformalmodelanalyzed is i.i.d., whilemostpredictive models aredynamicin nature.Itappears thattheresults can beapplied prettymuch directlytostationarymodels with i.i.d. and independentinnovations, butanumberofdetails need tobe worked outto formalize theextension. A notherinterestingextension is tofully nonparametricprediction intervals and regions. T he approach outlined above can be utilized to calculate optimal unconditionalprediction intervals formodels with nosystematiccomponent. T hatis, we
have no modelforybutseek toconstructoptimalprediction intervals and regions using
estimated distributions. Itappears thatsuch estimated intervals and regions willalsobe asymptoticallyindependentofthenuisanceparameters ofthedensityestimator.T his result should beofgreatinterestand needs tobe worked outin greaterdetail.
R eferences
A ndrews, D .(19 9 4): ”EmpiricalProcessM ethodsinEconometrics,” inH andbookofEcono-metrics, V ol. 4, ed. by R . Engle and D .M cFadden, A msterdam: N orth H olland, 2247 -229 4.
B rown, B . and R . M ariano(19 8 4), “R esidual-B asedStochasticP rediction and
Estima-tion in aN onlinearSimultaneous System,”Econometrica, 52, 321-343.
B rown, B . and R . M ariano(19 8 9 ), “P redictors in D ynamicN onlinearM odels: L arge-SampleB ehavior,”Econometric T heory, V ol.5, N o.3.
B rown, B . and R . M ariano(19 9 1), “Prediction Intervals and R egions in N onlinear Simultaneous Systems,” M imeo, D epartmentofEconomics, R ice U niversity.
B rown, B ., and W . N ewey(l99 8 ): “E¢cientSemiparametricEstimation
ofExpecta-tions,”Econometrica66,453-464.
H owrey, P ., and H . Kelejian, “Simulation Versus A nalyticalSolutions: T he Case of
EconometricM odels,” inT .H .N aylor, ed.,ComputerSimulationExperimentswithM
od-els ofEconomic Systems, (N ewYork: J.W iley, 19 7 1).
M ariano, R ., and B . B rown, “A symptoticB ehaviorofPredictors inaN onlinearSimul-taneous System,”InternationalEconomic R eview, 24 (19 8 3), 523-536.
N ewey, W .(19 9 0): “SemiparametricE¢ciencyB ounds,”JournalofA ppliedEconometrics 5, 9 9 -135.
N ewey, W .(19 9 4): “T heA symptoticV arianceofSemiparametricEstimators,” Economet-rica62, 1349 -138 2.
N ewey, W ., and D . M cFadden(19 9 4): “Estimation andInferenceinL argeSamples,” in H andbookofEconometrics, V ol. 4, ed.byR .EngleandD .M cFadden, N ewY ork: N orth H olland, 2111-2445.