
Fitting outbreak models to data from many small norovirus outbreaks


Epidemics

journal homepage: www.elsevier.com/locate/epidemics


Eamon B. O’Dea a,∗, Kim M. Pepin b,c, Ben A. Lopman d,1, Claus O. Wilke a,e

a Section of Integrative Biology, University of Texas at Austin, 1 University Station C0930, Austin, TX 78712, USA
b Fogarty International Center, NIH, Bethesda, MD 20892, USA
c Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
d Gastrointestinal, Emerging and Zoonotic Infections Department, Centre for Infections, Health Protection Agency, London NW9 5EQ, UK
e Center for Computational Biology and Bioinformatics and Institute for Cell and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA

Article info

Article history:
Received 11 June 2012
Received in revised form 23 October 2013
Accepted 23 December 2013
Available online 8 January 2014

Keywords:
Stochastic epidemic model
Parameter estimation
Norovirus
Health-care-associated infection
Generalized linear model

Abstract

Infectious disease often occurs in small, independent outbreaks in populations with varying characteristics. Each outbreak by itself may provide too little information for accurate estimation of epidemic model parameters. Here we show that using standard stochastic epidemic models for each outbreak and allowing parameters to vary between outbreaks according to a linear predictor leads to a generalized linear model that accurately estimates parameters from many small and diverse outbreaks. By estimating initial growth rates in addition to transmission rates, we are able to characterize variation in numbers of initially susceptible individuals or contact patterns between outbreaks. With simulation, we find that the estimates are fairly robust to the data being collected at discrete intervals and imputation of about half of all infectious periods. We apply the method by fitting data from 75 norovirus outbreaks in health-care settings. Our baseline regression estimates are 0.0037 transmissions per infective-susceptible day, an initial growth rate of 0.27 transmissions per infective day, and a symptomatic period of 3.35 days. Outbreaks in long-term-care facilities had significantly higher transmission and initial growth rates than outbreaks in hospitals.

© 2014 The Authors. Published by Elsevier B.V. Open access under the CC BY-NC-SA license.
ISSN 1755-4365; http://dx.doi.org/10.1016/j.epidem.2013.12.002

∗ Corresponding author. Tel.: +1 202 784 7182.
1 Present address: Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30030, USA.

Introduction

A common and difficult problem in epidemiology is to estimate rates of disease spread. Accurate estimates of these and other population parameters are crucial in the evaluation of disease control measures (Anderson and May, 1992; Keeling, 2005; Halloran et al., 2009) or biological hypotheses (Lively, 2010). Heterogeneity complicates the problem of obtaining such estimates. For example, a person’s risk of infection depends on contact rates and acquired immunity, and these quantities can vary widely between people and outbreaks.

Norovirus (NoV) epidemiology provides a fine case in point of the need for models to accommodate heterogeneity. Noroviruses are the most common cause of diarrheal disease in the United States, causing an estimated 21 million cases (Scallan et al., 2011) and 71,000 hospitalizations per year (Lopman et al., 2011). A genetically diverse group of strains is often circulating within a population. New strains of the predominant genogroup 2 genotype 4 (GII.4) taxon appear regularly over time (Glass et al., 2009), and a person’s risk of infection, given exposure, likely depends on both the antigenicity of the virus and the type-specific immunity developed from the person’s previous exposure (Cannon et al., 2009). Other important heterogeneities include innate susceptibility (which depends on a person’s histo-blood group antigens and secretor status) and age-specific risks of exposure. Outbreak investigations (Evans et al., 2002; Thornley et al., 2011; Wikswo et al., 2011) have provided convincing evidence that single vomiting incidents in crowded settings can lead to scores of secondary cases. Models that account for both between-individual and between-population heterogeneity are needed to obtain the accurate parameter estimates required for predicting outbreak dynamics and implementing effective controls. At present, control measures are based on general infection-control principles (Centers for Disease Control and Prevention, 2011) and thus are likely to be somewhat inefficient.

A further complication for modeling norovirus transmission is that it often occurs in small outbreaks. The transmission and recovery times of cases in small outbreaks are correlated (Rida, 1991), which makes estimation difficult when using data from a single outbreak. An obvious solution to reducing the inaccuracy caused by within-outbreak correlations in data is to base estimates on data from multiple outbreaks.

Methods for estimating parameters from multiple outbreaks have been described before, but often have been developed for smaller data sets and computing resources than what are now available. For example, the previous approaches of Becker (1979) and Becker (1991) assumed only the observation of the final state of each outbreak was available, used moments estimators, did not formulate a general regression model to allow for variation in parameters between outbreaks, and may be implemented with pencil and paper. Our norovirus outbreak data set includes the full observation of a large number of outbreaks and a number of covariates that are likely to affect parameters. We thus here employ a different method that operates on the full observation of outbreaks, uses maximum-likelihood estimators, models the effect of covariates on outbreak parameters within a general regression framework, and exploits modern computing power to find estimates and their confidence intervals.

We propose a general approach to fitting data from many small outbreaks. Using simulated data, we assessed the performance of the proposed method as a function of the number of outbreaks in the data, the rounding of measurements to regular intervals of observation, the number of missing observations, and the imputation of missing observations. When the number of outbreaks was large, we found the performance to be satisfactory for data sets with realistic levels of all of these challenging features. Fitting our model to data from a large number of real norovirus outbreaks in health-care facilities, we found a distinct increase in transmission and initial growth rates in long-term-care facilities relative to hospitals. We examined the fit of the model and found the most noticeable defect to be lower-than-observed prediction of the initial growth of the outbreaks. However, the predicted dynamics became more accurate over time such that predictions never deviated widely from observations.

Methods

We developed the methods described in this section to fit a model of the outbreak dynamics of norovirus based on data from a large survey of gastroenteritis in health-care facilities in the former County of Avon, England. In this study, the events of symptom onset and recovery were recorded on a daily basis for cases of gastroenteritis in both care staff and patients in 15 hospitals and 135 long-term-care facilities over a year-long period in 2002–2003, and these events were classified into a total of 271 separate outbreaks (Lopman et al., 2004b). These outbreaks were for the most part small; the range in total cases spans from 2 to 90 cases and the median is 13 cases.

We begin by presenting our estimation methods. With the method defined, we then describe assumptions and imputation procedures used to prepare our data for application of the method. To complete the model specification for our application, we next describe the variables of the data chosen to be predictors of how parameters vary among outbreaks. Finally, we provide details about methods of simulation, calculation of confidence intervals, and choice of software.

Model

Although our aim is to introduce a general approach, we aim to do so by way of example. Thus we describe our methods in terms of a specific model choice made for the norovirus data. However, we do provide references to relevant results in the regression literature to indicate the full scope of this approach.

The states and transition rules for the model we adopt for individual outbreaks are as follows. The population consists of a fixed number of people of one or more types. The term type here identifies people by the rules governing their movement between different states with respect to norovirus infection. At the beginning of an outbreak, there is some positive number of people in an exposed, or latent, state for at least one of the types. This state represents people who have been exposed to an infection source and have a latent infection but are not contagious. They move to an infective state after an incubation period of fixed duration. The infective state represents contagious people, and for simplicity we assume that all contagious people are symptomatic. A susceptible state represents people who are susceptible to infection. Thus each susceptible of type $i$ moves to the latent state at the first point of a Poisson process with rate $\beta_i Y(t)$, where $\beta_i$ is the transmission rate for type-$i$ susceptibles and $Y(t)$ is the number of infectives at time $t$. All infective types have the same level of contagiousness and have gamma-distributed symptomatic periods with the same dispersion parameter, but the mean symptomatic period may differ between types. Further, types that represent care staff are moved into an infective-but-removed state when the time they have spent in the infective state exceeds a threshold of fixed duration. This transition rule represents the effect of infection-control policies that prevent staff from working when contagious. At the end of their symptomatic periods, infective and infective-but-removed people are moved into a recovered state. The recovered state represents individuals that gain immunity over the course of the outbreak. The outbreak ends when the number of infected people reaches zero.

In summary, our outbreak model is the widely studied susceptible-exposed-infective-recovered (SEIR) model with four customizations for our application. First, we allow people to vary in susceptibility and expected duration of infectiousness. Second, we do not make our transmission rate depend on the total number of people in the population. This departure removes the need for the total number of people to be estimated, and it is appropriate in small populations when an infective person may be able to infect every susceptible person in the population with approximately the same probability. For example, Forrester and Pettitt (2005) did not find that inclusion of the total population size significantly improved the fit of a model of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks within an intensive-care unit. Third, we do not assume that latent periods and infectious periods are exponentially distributed. Our approach is more realistic because it allows the probability of a person leaving a latent or infectious state to depend on how long she has been in that state. Fourth, we shunt some of the infectives into an infective-but-removed state to represent the isolation of contagious staff from the population.

As indicated in our outbreak model description, the rate at which a susceptible acquires infection from an infective may vary among members of a population, and we use the word type in a general sense to refer to subsets of the population that are assumed to be the same with respect to such variation. With multiple-outbreak data, we further define types as unique to individual outbreaks. In other words, we make no general assumption that people in different outbreaks may be modeled with the same parameters. We shall later choose a particular linear model that controls the extent to which parameters may vary among types, but many other choices for such models are possible within this framework. Types thus represent the fundamental unit of variation in this framework, and the likelihood function naturally breaks apart into factors for each type.

For each type, the recovery-time and transmission-time parts of the likelihoods further factor apart into common density functions. The simplicity of these functions belies an involved construction, available in Kalbfleisch and Prentice (2002), as the product integral of the likelihood of events in infinitesimal time steps, where the likelihood of each time step is conditional on the history of the model up until that time step. We shall introduce the full likelihood by introducing each of these functions in turn.

For type-$i$ people, the recovery-time part of the likelihood is

$$l_{\mathrm{rec}}(\mu_i, \theta) = \prod_{j=1}^{k_i} \frac{1}{\Gamma(1/\theta)\,(\theta\mu_i)^{1/\theta}}\, I_{i,j}^{1/\theta - 1} \exp\left(-\frac{I_{i,j}}{\theta\mu_i}\right), \tag{1}$$

where $k_i$ is the number of type-$i$ people infected over the course of an outbreak, $I_{i,j}$ denotes the length of the symptomatic period of the $j$th type-$i$ infection, $\mu_i$ is the mean of the symptomatic period of type-$i$ infections, and $\theta$ is the dispersion parameter, which we take to be the same for all types of infections. Eq. (1) represents the likelihood function for a joint distribution of gamma-distributed random variables. Recall that per our model definition, the symptomatic periods $I_{i,j}$ are gamma distributed.
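The recovery-time likelihood can be evaluated numerically on the log scale. The following is a minimal Python sketch, assuming the gamma density is parameterized by its mean and dispersion (shape equal to one over the dispersion, scale equal to dispersion times mean); the function name `log_lik_recovery`, the argument names `mu` and `theta`, and the example periods are hypothetical and chosen only for illustration.

```python
import math

def log_lik_recovery(periods, mu, theta):
    """Log of the recovery-time likelihood for one type.

    periods: symptomatic period lengths I_{i,j} (days)
    mu:      mean symptomatic period
    theta:   dispersion parameter (squared coefficient of variation)
    """
    shape = 1.0 / theta      # gamma shape parameter
    scale = theta * mu       # gamma scale parameter, so that the mean is mu
    total = 0.0
    for period in periods:
        total += (-math.lgamma(shape) - shape * math.log(scale)
                  + (shape - 1.0) * math.log(period) - period / scale)
    return total

# Hypothetical symptomatic periods; mu and theta are the simulation values
# quoted in the text (3.32 days and 0.58).
example_periods = [1.0, 2.5, 4.0, 6.0]
example_loglik = log_lik_recovery(example_periods, mu=3.32, theta=0.58)
```

With a dispersion of 1 the gamma density reduces to the exponential density, which gives a quick sanity check of the implementation.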

The transmission-time part of the likelihood for type-$i$ people is

$$l_{\mathrm{tr}}(\beta_i, X_i^{(0)}) = \frac{X_i^{(0)}!}{(X_i^{(0)} - k_i)!} \exp\left[-\beta_i \tau_i (X_i^{(0)} - k_i)\right] \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}), \tag{2}$$

where $X_i^{(0)}$ is the number of initial susceptibles, $\tau_i$ is the cumulative exposure of such people at the end of an outbreak (i.e., the total area under $Y(t)$), $Y_{i,j}$ is the number of infectives present when the $j$th such person becomes infected, and $h_{i,j}$ is the cumulative exposure of the $j$th such person when infected. Further discussion of this likelihood function is provided in the Appendix.
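The quantities entering the transmission-time likelihood can be computed directly from the infectious intervals of the cases. The sketch below is an illustration only, assuming $Y(t)$ is a step function that counts the intervals covering time $t$ and that cumulative exposure is the area under $Y$ up to $t$; the helper names and the three-case outbreak are hypothetical.

```python
def infectives_at(t, intervals):
    """Number of infectives Y(t): intervals (start, end) with start <= t < end."""
    return sum(1 for start, end in intervals if start <= t < end)

def cumulative_exposure(t, intervals):
    """Area under Y(s) for s from 0 to t, summed interval by interval."""
    return sum(max(0.0, min(end, t) - start) for start, end in intervals)

# Hypothetical outbreak: three cases with these infectious intervals (days).
intervals = [(0.0, 3.0), (2.0, 5.0), (4.0, 6.0)]
# Times at which the second and third cases were infected (one day of latency).
infection_times = [1.0, 3.0]

ys = [infectives_at(t, intervals) for t in infection_times]        # Y_{i,j}
hs = [cumulative_exposure(t, intervals) for t in infection_times]  # h_{i,j}
tau = cumulative_exposure(6.0, intervals)                          # total area under Y(t)
```

In this toy outbreak each infection occurs while exactly one infective is present, and the total exposure is the sum of the three interval lengths.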

In many cases, converting the data to a minimally sufficient form may be desirable for the preservation of patient privacy. An example of such a form would be to summarize the data as $\sum_j h_{i,j}$, $\sum_j I_{i,j}$, $k_i$, and $\tau_i$ for each type $i$. Such a form would still allow for calculation of maximum-likelihood estimates and Hessian-based (Wald) confidence intervals.

This minimally sufficient form of the data also illustrates the robustness of estimates to some imperfections of the data. The $h_{i,j}$ and $I_{i,j}$ only affect the likelihood through the sums $\sum_j h_{i,j}$ and $\sum_j I_{i,j}$. Thus some error in our calculation of $h_{i,j}$ and $I_{i,j}$ should not bias our estimates too much as long as the average error is close to zero, and thus great certainty about $Y(t)$ is not necessary. For example, if we underestimated $Y(t)$ at some points as a result of asymptomatic infectives being present in reality and we overestimated $Y(t)$ at some other points as a result of misdiagnoses, those errors may cancel each other to some extent. Also, sometimes the data consist of only the times at which people stop being infectious, for example, when people are isolated after being identified as infective. In such cases total exposure could still be estimated by using a kernel-smoothing method (Lau and Yip, 2008).
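To see concretely how the estimates depend on the data only through sums, consider the simplified case in which the number of initial susceptibles is treated as known. Setting the derivative of the log of Eq. (2) with respect to the transmission rate to zero then gives a closed-form estimate. The Python sketch below illustrates this special case only; it is not the fitting procedure used in this work, and the function name and numbers are hypothetical.

```python
def beta_hat_given_x0(k, sum_h, tau, x0):
    """Closed-form maximizer of Eq. (2) over the transmission rate
    when the initial number of susceptibles x0 is held fixed.

    k:     number of people infected during the outbreak
    sum_h: sum of the cumulative exposures at infection
    tau:   cumulative exposure at the end of the outbreak
    x0:    assumed known number of initial susceptibles
    """
    # Denominator: total exposure accumulated by all initial susceptibles.
    return k / (sum_h + tau * (x0 - k))

# Hypothetical summary statistics for one small outbreak.
estimate = beta_hat_given_x0(k=2, sum_h=5.0, tau=8.0, x0=4)
```

The individual exposures never appear on their own, so errors in them that average to zero leave the estimate nearly unchanged.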

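The likelihood (2) can be parameterized differently as

$$l_{\mathrm{tr}}(\beta_i, r_i) = \frac{(r_i/\beta_i)!}{(r_i/\beta_i - k_i)!} \exp\left[\tau_i(\beta_i k_i - r_i)\right] \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}), \tag{3}$$

where $r_i = \beta_i X_i^{(0)}$ is the initial per-infective incidence rate. In our application, we choose to estimate $r_i$ instead of $X_i^{(0)}$ because $r_i$ is easier to interpret in the context of our data. For brevity, we refer to $r_i$ as the initial growth rate.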
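The equivalence of the two parameterizations can be checked numerically. The following Python sketch, offered as an illustration under stated assumptions, evaluates the log of the transmission-time likelihood both ways, using the log-gamma function so that the factorials extend to non-integer numbers of susceptibles. The function names and the small data set are hypothetical; the rate values are the simulation settings quoted in the text.

```python
import math

def log_lik_tr_x0(beta, x0, k, tau, ys, hs):
    """Log of Eq. (2): parameterized by transmission rate and initial susceptibles."""
    log_falling = math.lgamma(x0 + 1.0) - math.lgamma(x0 - k + 1.0)
    events = sum(math.log(beta * y) - beta * h for y, h in zip(ys, hs))
    return log_falling - beta * tau * (x0 - k) + events

def log_lik_tr_r(beta, r, k, tau, ys, hs):
    """Log of Eq. (3): parameterized by transmission rate and initial growth rate."""
    x0 = r / beta
    log_falling = math.lgamma(x0 + 1.0) - math.lgamma(x0 - k + 1.0)
    events = sum(math.log(beta * y) - beta * h for y, h in zip(ys, hs))
    return log_falling + tau * (beta * k - r) + events

# Hypothetical outbreak summary: two infections, total exposure of 8 days.
beta, r = 0.0037, 0.2664
ys, hs, tau, k = [1, 1], [1.0, 4.0], 8.0, 2
value_x0 = log_lik_tr_x0(beta, r / beta, k, tau, ys, hs)
value_r = log_lik_tr_r(beta, r, k, tau, ys, hs)
```

Both calls return the same value up to floating-point error, since the second form only substitutes the ratio of growth rate to transmission rate for the number of initial susceptibles.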

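The full likelihood function that we use for an $n$-outbreak data set is then

$$l(\boldsymbol{\beta}, \mathbf{r}, \boldsymbol{\mu}, \theta) = \prod_i l_{\mathrm{tr}}(\beta_i, r_i)\, l_{\mathrm{rec}}(\mu_i, \theta), \tag{4}$$

where we use boldface to denote vectors with elements equal to the parameters for each type $i$.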

To make use of previous results from statistical theory as well as to use conventional language when writing about our model, we shall next present our model as a generalized linear model (GLM). GLMs are a broad class of statistical models that includes many commonly used regression models. A GLM consists of three components: (i) a density function from the exponential family, (ii) a linear model that maps predictive variables to a predictor, and (iii) a link function that maps the predictor to the mean of the density function.

Our likelihood functions, (1) and (3), fit the definition of exponential family densities. That is not to say that the transmission and recovery times from a small outbreak are independent random variables with those densities. In fact, they may be highly correlated (Rida, 1991). But the situation is analogous to that of GLMs for longitudinal data, where ignoring within-subject correlations increases the variance of estimates but still leads to accurate estimates in the limit of data from a large number of independent subjects (Liang and Zeger, 1986).

We obtain a linear model by associating each type of person in the model with a set of predictive variables. In the application to norovirus we describe here, such predictive variables are, for example, the type of facility in which an outbreak occurred (e.g., hospital or long-term-care facility). We combine these predictive variables into a design matrix $Z$, which has a row for each type $i$ and a column for each predictive variable. The linear mapping from multiple predictive variables to a linear predictor is achieved by multiplying the design matrix with a vector of regression parameters $c$.

As link function, we chose the natural log, which tended to perform better than other potential link functions in our application. For example, for transmission-rate estimates $\beta_i$, we let $\log \beta_i = Z_{i,*} c_\beta$, where $Z_{i,*}$ is row $i$ of the design matrix and $c_\beta$ are our regression parameters for the transmission rates.
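As a small illustration of the log link, the Python sketch below maps rows of a design matrix to transmission rates. The two-column design (intercept and a facility indicator) is a simplification, and the facility coefficient of 0.5 is a made-up value used only to show the mechanics; the intercept is set to the log of the baseline rate of 0.0037 quoted in the abstract.

```python
import math

def linear_predictor(z_row, coefs):
    """Dot product of one design-matrix row with the regression parameters."""
    return sum(z * c for z, c in zip(z_row, coefs))

# Columns: intercept, isLTCF. The second coefficient is hypothetical.
c_beta = [math.log(0.0037), 0.5]

z_hospital = [1, 0]   # reference group: hospital
z_ltcf = [1, 1]       # long-term-care facility

beta_hospital = math.exp(linear_predictor(z_hospital, c_beta))
beta_ltcf = math.exp(linear_predictor(z_ltcf, c_beta))
```

Under a log link each coefficient acts multiplicatively, so moving from the reference group to the other level scales the rate by the exponential of the coefficient.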

The conditions for consistency and asymptotic normality of parameter estimates for GLMs have been given by Fahrmeir (1985). For outbreak data, the values of predictive variables in the data are likely to be somewhat randomly determined, in which case the conditions given by Ding and Chen (2006a,b) apply.

One standard condition for consistency is that the true value of the parameter does not lie on the boundary of parameter space. That condition would seem to be violated for data sets in which the number of cases in an outbreak $k_i$ is equal to the number of susceptibles $X_i^{(0)}$ because the transmission-time part of the likelihood, (2), is defined only when $X_i^{(0)} \geq k_i$. However, given that we approximate the discrete quantity $X_i^{(0)}$ with a continuous one for the purpose of fitting the model, it seems reasonable to consider $k_i - 0.5$ as the lower bound of $X_i^{(0)}$ and to say that

$$l_{\mathrm{tr}}(\beta_i, X_i^{(0)}) = \frac{X_i^{(0)}!}{0!} \exp(0) \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}),$$

when $X_i^{(0)}$ is in $(k_i - 0.5, k_i]$. Then the true value of $X_i^{(0)}$ is guaranteed not to be on the boundary and standard consistency results apply.

In the Appendix, we provide an alternative proof of consistency for our model in the simple case that all outbreaks share the same parameters.

Evidence that the model performs well in realistic situations appears in the Results section. We were able to recover from simulated data the parameters for the non-trivial model that we fitted in our application.

We estimated the transmission rate and initial growth rate by maximizing the transmission-time factors in (4) given the outbreak data, using the Newton–Raphson method as implemented in the AD Model Builder (Fournier et al., 2011). To keep the Newton–Raphson search for maximum-likelihood estimates in the feasible parameter space, we added a penalty to the log likelihood whenever the implied final number of susceptibles $x = X_i^{(0)} - k_i$ for an outbreak was too close to zero, $x < \epsilon$. The penalty was of the form $C(x - \epsilon)^2$, where $C$ is an arbitrary numeric constant which we set to $C = 0.01$. Likewise, whenever $x < \epsilon$, we replaced $x$ by $\epsilon/(2 - x/\epsilon)$. Throughout this work, we used $\epsilon = 0.001$.
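The guard on the final number of susceptibles can be written as a small helper. This Python sketch is an illustration of the two formulas stated in the text, not the AD Model Builder code itself; the function name is hypothetical, and how the returned penalty enters the objective (added to a minimized negative log likelihood or subtracted from a maximized log likelihood) is left to the caller as an assumption.

```python
EPS = 0.001   # threshold below which x is considered too close to zero
C = 0.01      # penalty constant

def guard_final_susceptibles(x, eps=EPS, c=C):
    """Return (x_used, penalty) for the implied final number of susceptibles x.

    For x >= eps the value is unchanged and no penalty applies. For x < eps the
    value is replaced by eps / (2 - x / eps), which stays positive and joins
    continuously at x = eps, and a quadratic penalty c * (x - eps)**2 is returned.
    """
    if x >= eps:
        return x, 0.0
    return eps / (2.0 - x / eps), c * (x - eps) ** 2

feasible = guard_final_susceptibles(0.5)     # unchanged, no penalty
boundary = guard_final_susceptibles(0.0)     # replaced by eps / 2
infeasible = guard_final_susceptibles(-1.0)  # still mapped to a positive value
```

The replacement keeps the factorial terms of the likelihood defined even when the optimizer proposes fewer initial susceptibles than observed cases, while the penalty pushes the search back toward the feasible region.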

We estimated the mean infectious period and the dispersion of the infectious period by using the glm function in R (R Development Core Team, 2010). By default, the dispersion parameter for gamma GLMs is estimated via the moments estimate for the coefficient of variation, perhaps because estimates based on the residual deviance are sensitive to small values in the data (Venables and Ripley, 2002, p. 9). Consistently, we typically found default estimates to be more accurate than ML estimates when fitting small, simulated data sets and we found them to be very similar to ML estimates when fitting large data sets. To be consistent in our treatment of both small and large data sets, we used the default estimate of the dispersion parameter throughout this work.
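For the simplest case of a single type, so that the gamma GLM has only an intercept, the moment-based dispersion estimate can be sketched in a few lines of Python. This is an illustration under that simplifying assumption: the fitted mean is the sample mean, and the dispersion is the Pearson chi-square statistic divided by the residual degrees of freedom. The function name and example data are hypothetical.

```python
def gamma_moment_estimates(periods):
    """Mean and moment-based dispersion for an intercept-only gamma model.

    The dispersion is the sum of squared relative residuals divided by the
    residual degrees of freedom (number of observations minus one parameter).
    """
    n = len(periods)
    mean = sum(periods) / n
    pearson = sum(((p - mean) / mean) ** 2 for p in periods)
    return mean, pearson / (n - 1)

# Hypothetical symptomatic periods in days.
mean_hat, dispersion_hat = gamma_moment_estimates([1.0, 2.0, 3.0])
```

Because the residuals are scaled by the fitted mean, the estimate approximates the squared coefficient of variation of the symptomatic periods.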

Data

The norovirus (NoV) data we analyze here originated in a prospective surveillance program in hospitals and long-term-care facilities in England (Lopman et al., 2004a,b). We analyzed the dynamics of 75 outbreaks laboratory-confirmed to be caused by NoV in which a total of 1523 cases of gastroenteritis occurred among patients and staff. We selected these data from the larger data set produced by the surveillance program as follows.

Most records of infections that were attributed in whole or in part to norovirus included the dates of both the onset of and the recovery from symptoms. However, in many records both dates were missing, and in most outbreaks some records lacked at least one date.

We discarded all records from outbreaks in which more than 55% of the dates of recovery were missing. In the remaining outbreaks, we replaced missing dates of recovery with the corresponding onset date plus the median symptomatic period from complete records in that outbreak. These replacements were done as a preparation for the estimation of the transmission rates and were not included when estimating symptomatic periods.
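The imputation rule just described can be sketched as follows. This Python example is a minimal illustration, assuming each case record is a pair of onset day and recovery day with `None` marking a missing recovery date; the function name, the record layout, and the example outbreaks are hypothetical.

```python
from statistics import median

def impute_recoveries(records, max_missing=0.55):
    """Fill in missing recovery days for one outbreak.

    records: list of (onset_day, recovery_day or None)
    Returns None when more than max_missing of the recovery days are missing
    (the outbreak is discarded); otherwise returns the records with each
    missing recovery replaced by onset plus the outbreak's median period.
    """
    n_missing = sum(1 for _, recovery in records if recovery is None)
    if n_missing / len(records) > max_missing:
        return None
    typical = median(recovery - onset
                     for onset, recovery in records if recovery is not None)
    return [(onset, recovery if recovery is not None else onset + typical)
            for onset, recovery in records]

kept = impute_recoveries([(0, 3), (1, None), (2, 4)])
dropped = impute_recoveries([(0, None), (1, None), (2, 4)])
```

In the first outbreak the complete records have periods of 3 and 2 days, so the missing recovery is set 2.5 days after onset; the second outbreak is dropped because two thirds of its recovery dates are missing.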

We discarded all records where the onset date was missing. This practice is unlikely to introduce a large bias as long as a relatively small number of onset dates are discarded. We made sure that this number was relatively small by using data only from outbreaks in which the number of records that were missing onset dates was less than 7% of the number of records that were not missing onset dates.

We made several simplifying assumptions. We assumed a person is infective only when symptomatic, which is supported by Sukhrie et al. (2012). We further assumed that staff move to the infective-but-removed state after one day of symptoms, in accordance with an infection control policy. Of course, staff with norovirus symptoms are likely to stop working sooner than that in many cases. But it seems likely that all infective staff have some small probability of transmitting the disease to others before they leave. Zelner et al. (2013) found that household outbreak data supported a model in which infectiousness is highest at the onset of symptoms, which is often when vomiting occurs. A one-day infectious period is a simple way of modeling the effect of people who may, in fact, only be present for shorter periods with higher infectiousness.

Table 1
Distribution of norovirus outbreaks over the levels of the predictive variables in our regression model.

Facility   Season          Size class   # outbreaks
LTCF       Fall–winter     Small         1
LTCF       Fall–winter     Large         3
LTCF       Spring–summer   Small         4
LTCF       Spring–summer   Large         4
Hospital   Fall–winter     Small        18
Hospital   Fall–winter     Large        23
Hospital   Spring–summer   Small        11
Hospital   Spring–summer   Large        11

We also assumed that the latent period is fixed at 24 h, which falls well within the reported range of 12–48 h (Centers for Disease Control and Prevention, 2011). To keep the transmission-time part of the likelihood positive, we assumed a small, background hazard of infection ($10^{-8}$ that of an infective) triggered illness in cases when no infectives were present. We also assumed that the number of initial infective people was equal to the number of people reporting symptoms on the first day of the outbreak. Finally, we assumed that any changes in state happen at the same time each day.

Predictive variables

The predictive variables that determined our design matrices were as follows. The data were collected over the course of a one-year period beginning in April 2002, and we categorized the data into two groups by the period in which they began: spring–summer refers to outbreaks that started between April 1 and October 1 of the study year; fall–winter refers to outbreaks that began in the remainder of the study year. The period variable allows for variation in transmission rate as a result of seasonality of NoV.

As an additional predictive variable, we include what type of facility the outbreak occurred in, hospital or long-term-care facility (LTCF).

The third predictive variable we use is size class. We classify units in which the number of beds is less than or equal to the median number of beds as small. We classify the other units as large. This classification was done separately for hospital and LTCF units because LTCF units are usually larger than hospital units. For the hospitals, the small units have 6–22 beds and the large units have 24–33 beds. For the LTCF units, the small units have 6–34 beds and the large units have 36–66 beds. The size class variable allows the number of initial susceptibles to depend on the approximate total number of people in each unit. The variable also allows population sizes to affect contact rates.

The fourth predictive variable we use is case type, the two types being patient and staff. Case type is the only predictive variable that varied within outbreaks.

We use a facility–size-class–period–case-type combination with a relatively large amount of data as the reference group. Specifically, the reference group comprises outbreaks that occurred among patients in large care-units of hospitals that began between October 2002 and April 2003. The estimated rate parameter for the reference group serves as the coefficient of the intercept of the linear model. Estimates for other coefficients then inform us of how moving away from the reference group changes rate estimates. Table 1 contains the distribution of outbreaks among the levels of the predictive variables.

We do not consider interaction terms in this work. So in summary, Wilkinson and Rogers (1973) notation for the linear predictors for all our parameters is: 1 + isLTCF + isSmall + isSpringSummer + isStaff.

Confidence intervals

To obtain confidence intervals for the estimates, we performed a parametric bootstrap. Data were simulated according to our outbreak model with the estimated parameters. Each simulation produced data from a set of outbreaks equal in size to the set that we fit, with each outbreak in the simulation matching an outbreak in the fitted data in terms of initial number of infectives, predictive variables, fraction of case records with missing onset and recovery times, and fraction of cases with missing recovery times. Percentile confidence intervals for regression coefficients were estimated from 10,000 simulation replicates.
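The logic of a parametric bootstrap with percentile intervals can be shown on a deliberately simple stand-in model. In the Python sketch below, the outbreak model is replaced by exponentially distributed durations whose mean is the only parameter; this substitution, the function names, the index convention for the percentiles, and the data are all assumptions made for illustration.

```python
import math
import random

def percentile_interval(replicates, level=0.95):
    """Percentile interval: order the replicate estimates and read off the tails."""
    ordered = sorted(replicates)
    last = len(ordered) - 1
    lower = ordered[math.floor((1.0 - level) / 2.0 * last)]
    upper = ordered[math.ceil((1.0 + level) / 2.0 * last)]
    return lower, upper

def parametric_bootstrap(data, n_rep=1000, seed=1):
    """Fit, simulate from the fitted model, refit, and summarize the refits."""
    rng = random.Random(seed)
    mean_hat = sum(data) / len(data)          # fit: estimate from the real data
    replicates = []
    for _ in range(n_rep):
        # Simulate a data set of the same size from the fitted model.
        simulated = [rng.expovariate(1.0 / mean_hat) for _ in data]
        replicates.append(sum(simulated) / len(simulated))   # refit
    return mean_hat, percentile_interval(replicates)

# Hypothetical durations in days.
estimate, (ci_lower, ci_upper) = parametric_bootstrap([1.0, 2.5, 4.0, 6.0, 3.0])
```

Matching each simulated data set to the structure of the real one, as described above for missing and imputed records, lets the interval reflect the uncertainty those features introduce.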

Simulation

We used simulation to investigate how the bias and variance of our estimates depend on the number of outbreaks that they are based on as well as the amount of missing information. We also used simulation to generate bootstrap confidence intervals.

Simulations began with some initial numbers $X_i^{(0)}$ of type-$i$ susceptibles of one or more types. To initiate the outbreak, some additional susceptibles were added and moved into a latent state. All people entering the latent state moved to the infective state after a fixed time period. Type-$i$ people entering the infective state moved on to the recovered state after a gamma-distributed time period with mean $\mu_i$ and dispersion parameter $\theta$. For types of infectives representing care staff, movement to the infective-but-removed state preceded movement to the recovered state if and when time spent in the infective state exceeded a predetermined threshold. Every time the number of infectives or susceptibles changed, the time of a potential transmission event was calculated by drawing from an exponential distribution with rate $Y \sum_i \beta_i X_i$, where $Y$ is the number of (non-removed) infectives and $\beta_i$ is the transmission rate for susceptibles of type $i$. If the potential transmission was sooner than the next change in $Y$, a type of susceptible was chosen with probability proportional to $\beta_i X_i$ and moved into the latent state. Simulations stopped when the number of latent, infective, and infective-but-removed people reached zero. The output of the simulations was a record for each person infected giving transition times.
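The simulation scheme described above can be sketched for the single-type case without staff removal. The Python example below is an illustrative reimplementation using only the standard library, not the SimPy-based code used for this work; the function name, the event queue layout, and the default argument values are assumptions.

```python
import heapq
import random

def simulate_outbreak(x0, beta, mu, theta, latent=1.0, rng=None):
    """Simulate one single-type outbreak started by one latent index case.

    x0: initial susceptibles; beta: transmission rate; mu, theta: mean and
    dispersion of the gamma symptomatic period; latent: fixed latent period.
    Returns a list of (onset_time, recovery_time), one per infected person.
    """
    rng = rng or random.Random()
    shape, scale = 1.0 / theta, theta * mu
    t, x, y = 0.0, x0, 0
    events = [(latent, "onset")]        # scheduled onsets and recoveries
    records = []
    while events:
        # Potential transmission time, redrawn after every change of state;
        # this is valid because the exponential distribution is memoryless.
        rate = y * beta * x
        wait = rng.expovariate(rate) if rate > 0 else float("inf")
        if t + wait < events[0][0]:
            t += wait                   # transmission happens first
            x -= 1
            heapq.heappush(events, (t + latent, "onset"))
        else:
            t, kind = heapq.heappop(events)
            if kind == "onset":         # latent person becomes infective
                y += 1
                recovery = t + rng.gammavariate(shape, scale)
                records.append((t, recovery))
                heapq.heappush(events, (recovery, "recovery"))
            else:                       # infective person recovers
                y -= 1
    return records

# Parameter values follow the simulation settings quoted in the text.
cases = simulate_outbreak(72, 0.0037, 3.32, 0.58, rng=random.Random(0))
```

The loop ends when no onsets or recoveries remain scheduled, which corresponds to the numbers of latent and infective people both reaching zero.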

Our simulation experiment had a full factorial design, with the number of outbreaks $n$ being 1, 10, or 100; the fraction of recovery times imputed being either zero or approximately the highest such fraction in our real data (0.53); the fraction of records missing both onset and recovery times being either zero or approximately the highest such fraction in our real data (0.05); and onset and recovery times being either rounded to days or exact. For each combination of factor levels, we simulated data and attempted to fit it 10,000 times. These simulations had just one type, just one initially infected person, a transmission rate $\beta$ of 0.0037, an initial growth rate $r$ of 0.2664, a latent period of 24 h, infectious periods with a mean of 3.32 days and a dispersion parameter of 0.58, and no threshold time beyond which the infectives were moved into an infective-but-removed state.

The details of the simulation of missing data were as follows. First, an outbreak was simulated as usual. Second, the number of case records to remove was calculated as the largest integer less than the total number of case records times the parameter value for the fraction of records missing onset and recovery times. This number of records were selected at random and removed. Then the number of case records for which to impute recovery times was calculated as the largest integer less than the total number of remaining case records times the parameter value for the fraction of records imputed. This number of records were selected at random and given imputed recovery times.

Once-daily observation of the outbreak was simulated by rounding transition times down to the nearest whole day. Outbreaks were started at random times in the first day to prevent the rounding from having artificial effects on the data from small outbreaks.

Our gradient-based optimization code for model fitting, which worked well at estimating transmission rate parameters when the number of outbreaks was large, did not work well when the number of outbreaks was small. So we used specialized code to fit the models of the simulation study, which were more analytically tractable by virtue of not having linear predictors. The Appendix describes the basis for this code, which always finds the maximum-likelihood estimate if it exists and identifies cases in which no such estimate exists.

Software

Our outbreak simulation code made use of the SimPy (Vignaux et al., 2012) Python module. The RngStreams C library (L’Ecuyer et al., 2002) allowed for the simulations to run in parallel. We used the AD Model Builder (Fournier et al., 2011) and R2admb (Bolker and Skaug, 2011), an R (R Development Core Team, 2010) interface for it, to optimize the log likelihood. We prepared graphics with the R package ggplot2 (Wickham, 2009). Code capable of reproducing the results is available from the authors on request.

Results

We first present a simulation study that illustrates the accuracy of our methods on a large number of small outbreaks that were simulated from our model. With that self-consistency test passed, we then present estimates from the application of our method to the norovirus data. To show the extent to which our model was appropriate for the norovirus data, we then present diagnostics of the fit.

Simulation

We used simulation to see how many outbreak data sets may be required for estimates to be approximately normally distributed around the true parameter values. The simulations also allowed us to gauge the effects of the imputation and rounding necessary for our application.

Much previous work has shown that estimation with data from a single, small outbreak is unreliable (Shao, 1999, and refs. therein). Thus one benefit of aggregating data from multiple outbreaks is that it allows for data from minor outbreaks to produce reliable estimates. However, using data from minor outbreaks does represent a worst-case scenario in the sense that each such outbreak contributes only a small amount of information. For those two reasons, and to keep the simulation study at a manageable size, we restricted our simulations to one set of parameters that is guaranteed to result in small outbreaks. To allow for comparison with our fits to the norovirus data, we used the parameters estimated for our baseline regression group.

As expected, the estimates were not very good when using data from single outbreaks (Fig. 1). In about 49% of these simulations, the initial infective failed to infect anyone, limiting estimation to the length of the symptomatic period. In about 13% of these simulations, only one transmission occurred and the transmission and growth rate parameters were unidentifiable. In about 21% of these outbreaks, the estimate of $r$ was on the lower bound of parameter space, preventing calculation of Wald confidence intervals. In the remaining 17% of replicates, the coverage probability of the 95% Wald confidence intervals ranged from 80 to 90% (Table 2) and the bias and average standard error for the transmission rate were almost 100 times the true value of the parameter. The average correlation between the transmission rate and initial growth rate estimates was 94%. Estimates for the symptomatic period, although obtained for all replicates, were also not accurate (Fig. 1 and Table 3).

Table 2
Simulation results for transmission rate $\beta$ and initial growth rate $r$. n denotes the number of outbreaks simulated for an estimate. Imputed refers to the fraction of recovery times deleted and then imputed as described in the Methods section. Missing refers to the fraction of case records deleted before fitting the data. Rounded indicates whether the onset and recovery times were rounded to whole days. In the simulations, $\beta$ was set to 0.0037 transmissions per infective-susceptible day and $r$ was set to 0.2664 transmissions per infective day.

n     Imputed   Missing   Rounded   Bias($\hat{\beta}$)   Av. s.e.($\hat{\beta}$)   $\beta$ cover. (%)   Bias($\hat{r}$)   Av. s.e.($\hat{r}$)   $r$ cover. (%)
1     0.00      0.00      0         0.198      0.131     82   0.52      0.834     87
1     0.00      0.00      1         0.196      0.129     82   0.52      0.830     88
1     0.00      0.05      0         0.21       0.130     81   0.55      0.84      87
1     0.00      0.05      1         0.206      0.127     83   0.53      0.83      90
1     0.53      0.00      0         0.251      0.132     80   0.67      0.958     81
1     0.53      0.00      1         0.234      0.137     80   0.65      0.966     81
1     0.53      0.05      0         0.231      0.149     83   0.63      0.98      84
1     0.53      0.05      1         0.214      0.134     81   0.61      0.96      84
10    0.00      0.00      0         0.0154     0.0456    89   0.0389    0.2882    95
10    0.00      0.00      1         0.0155     0.0457    90   0.0378    0.2888    95
10    0.00      0.05      0         0.0160     0.0471    90   0.0370    0.2932    96
10    0.00      0.05      1         0.0160     0.0470    89   0.0373    0.2919    96
10    0.53      0.00      0         0.0158     0.0525    93   0.0602    0.3116    91
10    0.53      0.00      1         0.0162     0.0533    93   0.0608    0.3134    91
10    0.53      0.05      0         0.0168     0.0558    92   0.0583    0.319     92
10    0.53      0.05      1         0.0170     0.0562    93   0.0587    0.319     92
100   0.00      0.00      0         0.00055    0.00571   94   0.0027    0.07380   95
100   0.00      0.00      1         0.00053    0.00569   94   0.0025    0.07370   95
100   0.00      0.05      0         0.00066    0.00614   94   −0.0001   0.07533   95
100   0.00      0.05      1         0.00067    0.00617   93   −0.0004   0.07535   95
100   0.53      0.00      0         −0.00033   0.00668   96   0.0195    0.08036   82
100   0.53      0.00      1         −0.00034   0.00653   96   0.0131    0.07844   86
100   0.53      0.05      0         −0.00018   0.00724   97   0.0174    0.08222   85
100   0.53      0.05      1         −0.00024   0.00707   97   0.0103    0.08018   87

Rounding,deleting 5% ofcase records,and imputing53% of recoverytimesallgenerallyincreasedtheaveragestandarderror ofestimates,witheffectsinthatorder.Effectsonthebiaswere somewhatmorevariable,buttheasymptoticeffectsofthese pro-cedureson the bias appears tobe zero. However, even in the 100-outbreakscenariotheimputationcausedcoverage probabil-ities forr todeviate byas manyas 13 percentage pointsfrom 95%(Table2),whichrecommendstheuseofconfidenceintervals thataccountfortheimputation,suchastheonesweusedinour application.

On the whole, the estimates were much more accurate in the 10- and 100-outbreak scenarios (Fig. 1, Tables 2 and 3). They were also more robust. Estimates for r were on the lower bound 5% of the time in the 10-outbreak scenario and never on the lower bound in the 100-outbreak scenario. The likelihood was divergent about 7–10% of the time in the 10-outbreak scenario versus 0.1–2% of the time in the 100-outbreak scenario. The average correlation between the estimated transmission rate and growth rate was about 0.83 and 0.74 for replicates in the 10- and 100-outbreak scenarios, respectively.

In sum, the method works well with a sufficiently large data set. Moderate amounts of imputation, missing data, and rounding will have mostly modest effects on estimates. Simulation, as part of a parametric bootstrap procedure, can provide an indication of the accuracy of estimates for a particular data set of interest. We demonstrate such a procedure in our application.

Estimates for norovirus in health-care settings

We fitted our generalized linear model to data from a large prospective study of gastroenteritis in health-care settings (Lopman et al., 2004b). In this one-year study, patients and the care staff assigned to any of about 4500 beds in health-care facilities in the former County of Avon, England, were under active surveillance. Trained staff members recorded the dates over which people

Table 3

Simulation results for symptomatic period mean and dispersion parameter. n denotes the number of outbreaks simulated for an estimate. Missing refers to the fraction of case records deleted before fitting the data. Rounded indicates whether the onset and recovery times were rounded to whole days. Cover. refers to the coverage probability of Wald confidence intervals. Lower and Upper refer to the bounds of a bootstrap confidence interval. In the simulations, the mean was set to 3.32 days and the dispersion parameter was set to 0.58.

                        Symptomatic period mean          Dispersion parameter
n    Missing  Rounded   Bias     Av.s.e.  Cover.(%)      Bias     Lower  Upper
1    0.00     0         −0.60    3.76     83             −0.067   0.01   1.46
1    0.00     1         −0.59    3.75     84             −0.037   0.00   2.00
1    0.53     0         −0.65    4.13     80             −0.096   0.01   1.47
1    0.53     1         −0.59    4.26     80             −0.027   0.00   2.00
10   0.00     0         −0.104   1.511    91             −0.005   0.31   0.98
10   0.00     1         −0.106   1.534    91             0.008    0.32   0.98
10   0.53     0         −0.108   2.123    90             −0.016   0.23   1.12
10   0.53     1         −0.106   2.161    90             0.002    0.24   1.15
100  0.00     0         −0.010   0.4703   94             −0.0006  0.49   0.68
100  0.00     1         −0.010   0.4772   95             0.0157   0.50   0.70
100  0.53     0         −0.007   0.6853   94             −0.0012  0.46   0.73
100  0.53     1         −0.010   0.6928   94             0.0132   0.47   0.76


Fig. 1. Estimates versus number of outbreaks. The row names indicate parameters. Each small black point represents an estimate. The larger gray points represent the means of the estimates. The horizontal lines represent the values of the parameters used to simulate the data.

were symptomatic and took samples that allowed for laboratory confirmation of the causes of outbreaks. Fig. 2 shows the case histories that were used to fit our model.

The predictors in our model were facility type, which indicated whether an outbreak took place in a long-term-care facility (LTCF) or a hospital; size class, which indicated the number of patients and staff in the unit; period, which indicated the time of the study year when the outbreak began; and case type, which indicated whether a case was a patient or a member of the care staff. See the Methods section for more details.

For our baseline regression group of patients in large hospitals in the fall and winter, the estimate (95% bootstrap confidence interval) of the transmission rate was 0.0037 (0.0026–0.0052) transmissions per infective-susceptible day, that of the initial growth rate was 0.27 (0.23–0.30) transmissions per infective day, that of the symptomatic period was 3.35 (3.09–3.57) days, and that of the dispersion parameter for the symptomatic period was 0.57 (0.54–0.65). Those parameter estimates have been transformed from the log scale for ease of interpretation. The full set of untransformed estimates is given in Table 4.
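The back-transformation from the log scale can be sketched in a few lines of Python. The intercepts are copied from Table 4; the dictionary keys and the function name are illustrative choices rather than names used in the original analysis.

```python
import math

# Log-scale intercepts for the baseline group (patients, large hospitals,
# fall-winter), copied from the (Intercept) row of Table 4.
LOG_BASELINE = {
    "transmission_rate": -5.60,    # log transmissions per infective-susceptible day
    "initial_growth_rate": -1.31,  # log transmissions per infective day
    "symptomatic_period": 1.21,    # log days
}


def to_natural_scale(log_estimates):
    """Exponentiate each log-scale estimate to recover the natural scale."""
    return {name: math.exp(value) for name, value in log_estimates.items()}


baseline = to_natural_scale(LOG_BASELINE)
for name, value in baseline.items():
    print(f"{name}: {value:.4g}")
```

Applying the same exponentiation to the end points of a log-scale confidence interval yields the interval on the natural scale, because the exponential function is monotone.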

The basic reproduction number R0 is the expected number of new infections that a single infection will cause at the beginning of an outbreak. Table 5 contains values of R0 calculated from the regression coefficients using the formula $R_0 = (r_{\text{patient}} + r_{\text{staff}}) \times (\text{mean symptomatic period})_{\text{patient}}$.

Fig. 2. Case histories. Each horizontal bar represents the history of a person. The symptomatic period is filled in. Case IDs were assigned by sorting the cases first by onset time, then by recovery time, and then by a random ordering. Initial infectives were given negative case IDs. The panels are arranged so that the outbreak size increases from top to bottom and the outbreak length increases from left to right. Case histories from long-term-care facilities (LTCFs) are in light gray. Some of the times of recovery from symptoms were imputed as described in the Methods section. Axes: time (weeks) and case ID number.

Fig. 3 shows estimates for the effect on model parameters of an outbreak being different from the outbreaks in the reference group with respect to one of our predictive variables. The largest effects are the increase in transmission and growth rates in long-term-care facilities (LTCFs) and the reduction in these rates in staff. It appears that transmission rates are higher in the smaller units. Symptomatic periods were estimated to be about 25% shorter for outbreaks in LTCFs and 20% shorter for cases among staff.

Diagnostics

As a general test of model fit for the transmission rate and growth rate likelihoods, we calculated the percentile of the log likelihood of the fit to the real data in the distribution of log likelihoods generated by bootstrapping. Out of 10,000 bootstrap replicates, our optimization code found estimates in 9809 cases. The log likelihood of the fit to the real data was in the 25th percentile of the log likelihoods from these estimates. Thus, the log likelihood of our fit to the real data is not extreme, consistent with a good model fit.
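The percentile calculation can be sketched as follows. The function name and the four replicate log likelihoods are hypothetical illustrations, not values from the study, and counting ties as at or below the observed value is only one of several possible conventions.

```python
def percentile_rank(observed, replicates):
    """Percentage of replicate values that are at or below the observed value."""
    if not replicates:
        raise ValueError("need at least one bootstrap replicate")
    at_or_below = sum(1 for value in replicates if value <= observed)
    return 100.0 * at_or_below / len(replicates)


# Hypothetical log likelihoods from four bootstrap fits (illustration only).
bootstrap_loglik = [-1650.0, -1630.0, -1610.0, -1600.0]
observed_loglik = -1640.0

rank = percentile_rank(observed_loglik, bootstrap_loglik)
print(f"observed fit sits at percentile {rank:.0f} of the bootstrap distribution")
```

A rank near 0 or 100 would flag the observed log likelihood as extreme relative to data simulated from the fitted model; replicates for which the optimizer found no estimate would simply be left out of the list.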

Our use of the moments estimator for the dispersion parameter in (1) precluded a similar assessment of model fit for the

Table 4

Regression coefficients (95% bootstrap confidence interval) for the natural log of the outbreak-model parameters.

                 Transmission rate        Initial growth rate      Symptomatic period
(Intercept)      −5.60 (−5.96, −5.25)     −1.31 (−1.45, −1.20)     1.21 (1.13, 1.27)
LTCF             1.98 (1.61, 2.35)        1.06 (0.87, 1.20)        −0.28 (−0.37, −0.18)
Small            0.45 (0.34, 0.70)        −0.03 (−0.15, 0.11)      0.08 (0.00, 0.16)
Spring–summer    −0.06 (−0.29, 0.14)      0.04 (−0.09, 0.17)       0.00 (−0.08, 0.09)
Staff            −1.03 (−1.43, −0.74)     −0.87 (−1.00, 0.73)      −0.23 (−0.31, −0.14)
Dispersion                                                         0.57 (0.54, 0.65)

Num. obs.: 1523, 1412
Log likelihood: −1625, −2896


Table 5

Predicted R0 values (95% bootstrap confidence interval) based on regression coefficients.

Facility   Season          Size class   R0
LTCF       Fall–winter     Small        2.78 (2.55, 3.23)
LTCF       Fall–winter     Large        2.62 (2.40, 2.98)
LTCF       Spring–summer   Small        2.90 (2.69, 3.29)
LTCF       Spring–summer   Large        2.74 (2.53, 3.04)
Hospital   Fall–winter     Small        1.27 (1.22, 1.43)
Hospital   Fall–winter     Large        1.20 (1.14, 1.33)
Hospital   Spring–summer   Small        1.33 (1.26, 1.53)
Hospital   Spring–summer   Large        1.25 (1.19, 1.41)

symptomatic periods. However, inspection of the default diagnostic plots for glm objects in R did not indicate any problems.

The estimated value of the log of the initial numbers of susceptibles (calculated as $\log \hat{X}_i^{(0)} = \log(\hat{r}_i/\hat{\beta}_i)$) provided a means for a sanity check of our estimates. As described in the Methods section, our size-class predictive variable was determined from the number of beds or staff assigned to a unit. We chose not to directly use the numbers of beds or staff as the initial number of susceptibles in our model because they are likely noisy measurements of the true value. However, it is worth noticing in Fig. 4 that our estimates are on approximately the same scale as the numbers of beds and staff. Furthermore, our estimates replicate two qualitative differences seen in the bed-number and staff-size data. First, units in the small size class do indeed have smaller $\log \hat{X}_i^{(0)}$ than do units in the large size class. Second, $\log \hat{X}_i^{(0)}$ for staff are larger than those for patients. On the other hand, $\log \hat{X}_i^{(0)}$ is larger for hospitals than for LTCFs while LTCFs have more beds and larger staffs. This difference might be reduced by increasing the complexity of our model, but such an exercise in model selection is beyond the scope of this paper.
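The qualitative comparisons in this paragraph can be checked directly from the point estimates in Table 4, since on the log scale the initial number of susceptibles is the growth-rate linear predictor minus the transmission-rate linear predictor. The Python sketch below does this; the dictionary layout, level names, and function name are assumptions made for illustration, and the calculation ignores the bootstrap uncertainty shown in Fig. 4.

```python
import math

# Point estimates from Table 4 as (log transmission rate, log initial growth rate).
COEF = {
    "intercept": (-5.60, -1.31),
    "ltcf": (1.98, 1.06),
    "small": (0.45, -0.03),
    "spring_summer": (-0.06, 0.04),
    "staff": (-1.03, -0.87),
}


def log_initial_susceptibles(*levels):
    """log X(0) = log r - log beta for the reference group modified by `levels`."""
    log_beta = COEF["intercept"][0] + sum(COEF[level][0] for level in levels)
    log_r = COEF["intercept"][1] + sum(COEF[level][1] for level in levels)
    return log_r - log_beta


reference = log_initial_susceptibles()
for levels in [(), ("small",), ("staff",), ("ltcf",)]:
    value = log_initial_susceptibles(*levels)
    label = "+".join(levels) if levels else "reference"
    print(f"{label}: log X(0) = {value:.2f}, X(0) = {math.exp(value):.0f}")
```

Under these point estimates, the small and LTCF levels lower the implied initial number of susceptibles relative to the reference group, and the staff level raises it, which agrees with the comparisons described in the text.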

Fig. 3. Regression effect estimates. The column names indicate levels of the categorical predictive variables by which outbreaks differed from the reference group. The reference group was made up of all cases of norovirus among patients in large hospitals in the fall and winter. The row names indicate parameters. The histograms display the distributions of estimates obtained in a parametric bootstrap. Gray rectangles indicate a 95% confidence interval based on the percentiles of bootstrap estimates. The white horizontal line inside each rectangle indicates the ML estimate. LTCF stands for long-term-care facility. Columns: LTCF, Small, Spring–summer, Staff. Rows: transmission rate, initial growth rate, symptomatic period. Axes: count and multiplicative effect.

Fig. 4. Estimated initial susceptibles and measures of care-unit size. The column names, row names, and x-axis marks indicate the levels of the predictive variables that determined the estimate of initial susceptibles represented by the gray bars. The middle 95% of estimates from the parametric bootstrap were used to determine the upper and lower bounds of the gray bars. Each point corresponds to an observed outbreak of NoV and indicates either the number of beds for patients in the care unit or the number of staff assigned to a unit. LTCF stands for long-term-care facility. Estimates of initial susceptibles did not vary much by the period predictive variable, so estimates and data from both periods are grouped together in this plot. Columns: Small, Large. Rows: LTCF, Hospital. Axes: case type (patient, staff) and initial susceptibles.

To graphically evaluate the fit of our model, we plotted observed values of cumulative exposure before infection, symptomatic period duration, and cumulative incidence against those expected from the fitted model (Figs. 5, 6, and 7).

Fig. 5. Quantile–quantile plots of the cumulative exposure of people at the time of infection. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching distributions would fall.


Fig. 6. Quantile–quantile plots of the symptomatic periods. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching distributions would fall.

From Fig. 5, it seems that the model fits best for intermediate values of cumulative exposure in the range of 50–200 infective-person days. In this range, the lines connecting data points from individual outbreaks are roughly parallel to and centered about the reference line, which shows that in this range the observed distributions are similar in shape to the expected distributions but with probability densities shifted to the left or right. For low values of cumulative exposure, the observed distributions have much higher densities than the fitted model would predict. This disagreement is visible as the high density of points below the reference line. For high values of cumulative exposure, the observed distributions seem to be truncated at around 220 infective-person days, whereas the fitted model assumes an exponential distribution that predicts much larger values.

Fig. 7. Observed versus expected cumulative case counts. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching functions would fall.

In Fig. 6, observed symptomatic periods are generally close to predicted symptomatic periods. However, there seems to be a tendency for symptomatic periods in some outbreaks to be less varied than the model predicts. This tendency is visible as lines that start above the reference line on the left side of the plot and then bend to the right quickly enough to be under the reference line by the time they end.

In Fig. 7, cumulative incidence is higher than expected early in outbreaks as a consequence of the larger-than-predicted number of cases with small exposure (Fig. 5). The difference between observed and expected cumulative incidence tends to become less positive or slightly negative by the end of outbreaks, and the absolute difference is usually less than 10 throughout.

Discussion

We have shown that reliable estimation of parameters from many small outbreaks is possible using a generalized linear model based on standard stochastic epidemic models. A simulation study demonstrated that we are able to accurately estimate parameters when the data stem from small outbreaks even when some data are missing and about half of recovery times are imputed. Fitting the model to a large number of outbreaks of norovirus, we found that facility type, facility size, and case type seem to have significant effects on outbreak dynamics.

Höhle (2009) described a highly general formulation of stochastic epidemic models within a regression framework, and our model is almost a special case of that general formulation. However, that formulation did not include regressions for the mean infectious period or the initial growth rate, which we include here. Modeling details aside, our work here differs from Höhle (2009) and related regression approaches (Forrester and Pettitt, 2005; Voirin et al., 2011; Meyer et al., 2012) in demonstrating the particular value of a multiple-outbreak regression when fitting data from small outbreaks, which are quite common in health-care settings.

The most striking result of our regression estimates (Fig. 3) is the approximately 7-fold increase in transmission rates and 3-fold increase in initial growth rates in the long-term-care facilities (LTCFs) relative to hospitals. Fig. 2 shows that LTCF outbreaks do indeed include many of the larger and faster growing outbreaks in the data set.

The higher transmission rates for occupants of LTCFs may be a consequence of occupants having more opportunity to socialize in large groups. Alternatively, we may be seeing the effects of our assumptions of a closed population and homogeneous mixing being violated. Hospitals have more rapid turnover of patients, and the exposure of people who arrived in the care unit after the outbreak started will be overestimated in our model. Occupants of LTCFs may vary more in contact rates by virtue of behavioral differences, and such variation in exposure could lead to a higher initial growth rate (Becker, 1989, pp. 133–138).

Our model is agnostic about the particular pathways of transmission in the outbreaks, and thus the estimates represent rates of transmission by all routes including transmission by person-to-person, environmentally-mediated, and foodborne routes. Outbreak investigations often associate foodborne transmission with an abrupt increase in infections (e.g., Isakbaeva et al., 2005), and thus differences in the frequency of this mode of transmission between settings could cause differences in our estimates. That being said, we did not include infection terms corresponding to foodborne transmission because most reported norovirus outbreaks are attributed primarily to person-to-person transmission (Yen et al., 2011), and we believed that all the outbreaks we analyzed involved primarily a combination of person-to-person and environmentally-mediated transmission. In crowded settings such as health-care facilities, it is often difficult to determine the relative frequency of these routes of transmission (Kuusi et al., 2002; Isakbaeva et al., 2005; Centers for Disease Control and Prevention, 2008).

Although our aim was not to fit the data to a highly realistic model, we did take a significant step towards realism by modeling variation in the initial growth rate of the outbreak. The transmission rate in our model determines how the expected number of new cases increases as the product of the numbers of infectives and susceptibles increases. Thus estimates of the transmission rate will be highly sensitive to those of the initial number of susceptibles, which determines the number of susceptibles throughout the outbreak. But in the case of norovirus, the number of susceptibles is difficult to know as there is no serological correlate of protection. Recent work on joint estimation of transmission rates and the initial number of susceptibles with data from a single outbreak (Hayakawa et al., 2003; Huggins et al., 2004; Lau and Yip, 2008; Kypraios, 2009) has shown that estimates of the initial number of susceptibles tend to be low when data sets are small. As we have seen in our simulation results (Fig. 1), this bias decreases as the number of outbreaks in the data set increases, even if all outbreaks are small.

The estimates for NoV transmission dynamics we calculated complement results from previous epidemiological analyses of NoV in health-care settings. Previous analyses of our data set (Lopman et al., 2004a,b) had examined how risk of NoV infection or particular symptoms of NoV infection varied with age and other personal characteristics. The current analysis adds to these results by quantifying effects that could be used to predict norovirus outbreak dynamics.

Analysis of a 2003–2006 study of NoV outbreaks in long-term-care facilities (LTCFs) in Oregon (Rosenthal et al., 2011) suggested that larger facilities may have a higher risk of experiencing outbreaks. Our result that transmission rates are lower in larger facilities suggests that any increased risk that larger facilities have is not caused by increased transmission rates. However, our result must be interpreted with caution because we have not been able to account for many factors that may affect contact rates, such as number of beds per room. Whether the cause of the lower transmission rates is really facility size or a correlation of facility size with some omitted variable is unclear.

A few previous studies have estimated individual-level parameters for NoV that are comparable to our estimates. Using data from a NoV outbreak in a primary school and nursery in Derbyshire, England, O'Neill and Marks (2005) estimated that the probability of a susceptible person avoiding infection from an infective person in the school for a day was 0.998. Using the formula $\Pr(\text{avoidance}) = \exp(-\beta \times 1\ \text{susceptible} \times 1\ \text{infective} \times 1\ \text{day})$, our estimates yield Pr(avoidance) that ranges from about 0.959 for patients in small LTCFs to 0.999 for staff in large hospitals.
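The two end points of this range can be recovered from the Table 4 point estimates. In the Python sketch below, the function and variable names are illustrative assumptions; the transmission rates are obtained by summing the relevant log-scale coefficients for the transmission rate and exponentiating.

```python
import math


def avoidance_probability(beta, susceptibles=1, infectives=1, days=1):
    """Pr(avoidance) = exp(-beta * susceptibles * infectives * days)."""
    return math.exp(-beta * susceptibles * infectives * days)


# Transmission-rate coefficients from Table 4 (log scale).
INTERCEPT, LTCF, SMALL, STAFF = -5.60, 1.98, 0.45, -1.03

beta_patient_small_ltcf = math.exp(INTERCEPT + LTCF + SMALL)
beta_staff_large_hospital = math.exp(INTERCEPT + STAFF)

p_low = avoidance_probability(beta_patient_small_ltcf)
p_high = avoidance_probability(beta_staff_large_hospital)
print(f"patients, small LTCF: beta = {beta_patient_small_ltcf:.3f}, Pr(avoidance) = {p_low:.3f}")
print(f"staff, large hospital: beta = {beta_staff_large_hospital:.4f}, Pr(avoidance) = {p_high:.3f}")
```

The season coefficient is left at its reference level (fall and winter) in both cases, which is an assumption of this sketch; including the small spring–summer effect changes the results only slightly.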

Heijne et al. (2009) estimated the basic reproduction number of NoV in boy-scout camps to be about 14 and 7, respectively, under two different sets of assumptions. Our highest R0 was approximately equal to 3 (Table 5). The relative lowness of our R0s might reflect contact rates being higher in the camp setting, and it may also reflect the effect of better hygiene in the health-care settings. Heijne et al. (2009) estimated that the implementation of an enhanced hygiene protocol drove the reproduction number in the camps down to about 2 and 1, values on par with our own estimates. Likewise, Heijne et al. (2012) estimated reproduction numbers for an outbreak within the wards of a psychiatric institution to be close to one and thus close to our hospital R0s. Actually, those estimates tend to be lower than ours, but they are reproduction numbers averaged over the full course of an outbreak and thus should be expected to be lower due to the depletion of susceptibles over time. Our estimates may be more generalizable than both the psychiatric-ward and boy-scout estimates because our data set was larger and included data from both large and small outbreaks.

Zelner et al. (2010) used data from a Stockholm outbreak to estimate that the average infectious period was 1.2 days. The setting of these outbreaks was households that included children in daycare centers. Thus, the infectious period may have been shorter in these outbreaks because many of the infectives were likely healthy people between the ages of 5 and 70, whereas people below the age of 5 and, to an even greater degree, people over the age of 70 were over-represented in our data (Lopman et al., 2004b). In our data, people in these extreme age groups had average symptomatic periods of 3 days (Lopman et al., 2004b). Moreover, the Stockholm estimate is based on imputed infectious periods rather than symptomatic periods, which were not reported. As a result, if the assumed initial number of susceptibles for the Stockholm analysis was too high, the infectious period would have been underestimated.

Although our estimates of the symptomatic period may be relatively long, it is possible that some of the patients were discharged into the community before they became asymptomatic. Thus, for patients, our estimates most accurately describe the period of being symptomatic while simultaneously being in a health-care facility.

The daily transmission rates estimated from the Stockholm data, 0.14 transmissions per infective-susceptible day, are more than 3-fold higher than our highest estimated transmission rate, which was 0.04 transmissions per infective-susceptible day for patients in small LTCFs. The joint estimation approach we used could be applied to the Stockholm data to determine whether the higher transmission-rate estimates may have resulted from underestimation of household sizes.

However, the transmission rates may well be different because of differences in hygiene measures, contact rates, or different levels of baseline immunity. Nurses, for example, may be frequently exposed and therefore highly immune. Additionally, time-series analysis of outbreak incidence (Lopman et al., 2009) has suggested that transmission rates generally may vary with host, weather, and virus factors. Taken together, these differences may explain the large discrepancy in estimated transmission rates. The compilation and analysis of a large, multiple-outbreak data set that includes predictive variables indicative of hygiene, contact rates, and baseline immunity could shed light on which of these elements has the greatest effect on transmission rates.

In our application, we made the simplifying assumption that the latent period was fixed at its mean, which allowed us to directly calculate infection times from the reported onset of symptoms. The infection times determine the cumulative exposures $h_{i,j}$ in (3). Because the cumulative exposure is a non-linear function of time and the mean of a non-linear function of a random variable does not always equal the function evaluated at the random variable's mean, the extent to which latent periods varied in reality likely introduced bias into our calculated cumulative exposures and the estimates based on them. The bias could be either positive or negative depending on whether cumulative exposure usually increases more quickly before or after the assumed transmission times.

Another simplifying assumption we made was that people were only infectious when they were symptomatic. In support of this assumption, Sukhrie et al. (2012) have shown that asymptomatic people are much less infectious than symptomatic people in health-care settings. To quantify the effect of this and the fixed latent period assumption, we could make the infectious period a latent variable that we integrate over to evaluate the likelihood, as in Hohle et al. (2005). Alternatively, we could use a kernel-smoothing method to estimate unobserved latent or infectious periods, as in Lau and Yip (2008). However, even without such calculations it is clear that if, in reality, the infectious period extends beyond the symptomatic period, our estimates of transmission rates have been inflated by our underestimation of exposure.

From the numerical results displayed in Tables 2 and 3, we see that highly reliable estimation depends on collection of an extensive data set. The Centers for Disease Control and Prevention (CDC) has recently established a National Outbreak Reporting System that, with the contributions of state health departments, will provide more comprehensive surveillance for all U.S. gastroenteritis outbreaks (Centers for Disease Control and Prevention, 2011). However, the data we have analyzed here is more detailed than what is routinely collected in outbreak investigations. More detailed outbreak investigations are needed to collect such data and further characterize modes of transmission. The collection of NoV genomic data may also be of great value (Teunis et al., 2013).

Regarding the general subject of outbreaks of hospital-acquired infections in individual hospitals, we suspect that in some hospitals a large part of the necessary data collection is already taking place as a part of existing surveillance programs. A 2008–2009 survey estimated that approximately one third of California hospitals used automated surveillance technology to monitor hospital-acquired infections (Halpin et al., 2011). Similarly, a survey of hospitals in the Northeastern United States found that one third of hospitals had an electronic surveillance system in place (Grota et al., 2010). Such systems were used to detect outbreaks, analyze data, and generate reports of hospital-acquired infections (Grota et al., 2010). As several states mandate reporting infection rates of MRSA and many more require some form of reporting of hospital-acquired infections (Committee to Reduce Infection Deaths, 2011), many hospitals may have data on the total number of cases in many outbreaks for several pathogens of concern.

Our analysis, based on a robust data set, demonstrates that parameter estimates are substantially less biased when a large number of outbreaks are fitted. We submit that, for norovirus and many other pathogens, there are several uses for accurate estimates of transmission rates, initial growth rates, and infectious periods. Policy-makers can use such estimates to compare the efficacy of different control strategies such as hygiene protocols, isolation measures, prophylactic treatments, and vaccination policies. Those monitoring the small outbreaks of zoonotic diseases may be able to use such estimates to identify variables that make transmission more likely.

Role of the funding source

This work was supported by NSF Grant EF-0742373 and NIH Grant R01GM088344. E.B.O. was also supported by BEACON Grant 61-2075UT. K.M.P. was also supported by the RAPIDD program of the Science and Technology Directorate, U.S. Department of Homeland Security, and the Fogarty International Center, NIH. Our funding sources had no role in this study's design; in the collection, analysis, and interpretation of the analyzed data; in the writing of this paper; or in the decision to submit this paper for publication.

Acknowledgments

The Texas Advanced Computing Center at UT provided computing resources.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.epidem.2013.12.002.

References

Anderson, R.M., May, R.M., 1992. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, USA.

Becker, N., 1979. An estimation procedure for household disease data. Biometrika 66, 271–277.

Becker, N.G., 1989. Analysis of Infectious Disease Data. Chapman and Hall/CRC.

Becker, N.G., 1991. Analysis of infectious disease data from a sample of households. Lecture Notes-Monograph Series 18, 27–40.

Bolker, B., Skaug, H., 2011. R2admb: ADMB to R interface functions. R package version 0.7.5.1.

Cannon, J.L., Lindesmith, L.C., Donaldson, E.F., Saxe, L., Baric, R.S., Vinjé, J., 2009. Herd immunity to GII.4 noroviruses is supported by outbreak patient sera. Journal of Virology 83, 5363–5374.

Centers for Disease Control and Prevention, 2008. Norovirus outbreak in an elementary school - District of Columbia, February 2007. MMWR 56, 1340–1343.

Centers for Disease Control and Prevention, 2011. Updated norovirus outbreak management and disease prevention guidelines. MMWR 60, 1–18.

Committee to Reduce Infection Deaths, 2011. State laws on reporting HAIs. http://www.hospitalinfection.org/legislation.shtml

Ding, J.L., Chen, X.R., 2006a. Asymptotic properties of the maximum likelihood estimate in generalized linear models with stochastic regressors. Acta Mathematica Sinica, English Series 22, 1679–1686.

Ding, J.L., Chen, X.R., 2006b. Large-sample theory for generalized linear models with non-natural link and random variates. Acta Mathematicae Applicatae Sinica, English Series 22, 115–126.

Evans, M.R., Meldrum, R., Lane, W., Gardner, D., Ribeiro, C.D., Gallimore, C.I., Westmoreland, D., 2002. An outbreak of viral gastroenteritis following environmental contamination at a concert hall. Epidemiology and Infection 129, 355–360.

Fahrmeir, L., 1985. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. The Annals of Statistics 13, 342–368.

Forrester, M., Pettitt, A.N., 2005. Use of stochastic epidemic modeling to quantify transmission rates of colonization with Methicillin-Resistant Staphylococcus aureus in an intensive care unit. Infection Control and Hospital Epidemiology 26, 598–606.

Fournier, D.A., Skaug, H.J., Ancheta, J., Ianelli, J., Magnusson, A., Maunder, M.N., Nielsen, A., Sibert, J., 2011. AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software 27, 233–249.

Glass, R.I., Parashar, U.D., Estes, M.K., 2009. Norovirus gastroenteritis. New England Journal of Medicine 361, 1776–1785.

Grota, P.G., Stone, P.W., Jordan, S., Pogorzelska, M., Larson, E., 2010. Electronic surveillance systems in infection prevention: organizational support, program characteristics, and user satisfaction. American Journal of Infection Control 38, 509–514.

Halloran, M.E., Longini, I.M., Struchiner, C.J., 2009. Design and Analysis of Vaccine Studies. Springer.

Halpin, H., Shortell, S.M., Milstein, A., Vanneman, M., 2011. Hospital adoption of automated surveillance technology and the implementation of infection prevention and control programs. American Journal of Infection Control 39, 270–276.

Hayakawa, Y., O'Neill, P.D., Upton, D., Yip, P.S., 2003. Bayesian inference for a stochastic epidemic model with uncertain numbers of susceptibles of several types. Australian and New Zealand Journal of Statistics 45, 491–502.

Heijne, J.C.M., Rondy, M., Verhoef, L., Wallinga, J., Kretzschmar, M., Low, N., Koopmans, M., Teunis, P.F.M., 2012. Quantifying transmission of norovirus during an outbreak. Epidemiology 23, 277–284.

Heijne, J.C.M., Teunis, P., Morroy, G., Wijkmans, C., Oostveen, S., Duizer, E., Kretzschmar, M., Wallinga, J., 2009. Enhanced hygiene measures and norovirus transmission during an outbreak. Emerging Infectious Diseases 15, 24–30.

Höhle, M., 2009. Additive-multiplicative regression models for spatio-temporal epidemics. Biometrical Journal 51, 961–978.

Hohle, M., Jorgensen, E., O'Neill, P.D., 2005. Inference in disease transmission experiments by using stochastic epidemic models. Journal of the Royal Statistical Society: Series C (Applied Statistics) 54, 349–366.

Huggins, R.M., Yip, P.S.F., Lau, E.H.Y., 2004. A note on the estimation of the initial number of susceptible individuals in the general epidemic model. Statistics & Probability Letters 67, 321–330.

Isakbaeva, E.T., Widdowson, M.A., Beard, R.S., Bulens, S.N., Mullins, J., Monroe, S.S., Bresee, J., Sassano, P., Cramer, E.H., Glass, R.I., 2005. Norovirus transmission on cruise ship. Emerging Infectious Diseases 11, 154–158.

Kalbfleisch, J.D., Prentice, R.L., 2002. The Statistical Analysis of Failure Time Data, 2nd edition. Wiley.

Keeling, M.J., 2005. Models of foot-and-mouth disease. Proceedings. Biological Sciences 272, 1195–1202.

Kuusi, M., Nuorti, J.P., Maunula, L., Minh Tran, N.N., Ratia, M., Karlsson, J., von Bonsdorff, C.H., 2002. A prolonged outbreak of Norwalk-like calicivirus (NLV) gastroenteritis in a rehabilitation centre due to environmental contamination. Epidemiology and Infection 129, 133–138.

Kypraios, T., 2009. A note on maximum likelihood estimation of the initial number of susceptibles in the general stochastic epidemic model. Statistics & Probability Letters 79, 1972–1976.

Lau, E.H.Y., Yip, P.S.F., 2008. Estimating the basic reproductive number in the general epidemic model with an unknown initial number of susceptible individuals. Scandinavian Journal of Statistics 35, 650–663.

L'Ecuyer, P., Simard, R., Chen, E., Kelton, W., 2002. An object-oriented random-number package with many long streams and substreams. Operations Research 50, 1073–1075.

Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.

Lively, C.M., 2010. An epidemiological model of host–parasite coevolution and sex. Journal of Evolutionary Biology 23, 1490–1497.

Lopman, B., Armstrong, B., Atchison, C., Gray, J.J., 2009. Host, weather and virological factors drive norovirus epidemiology: time-series analysis of laboratory surveillance data in England and Wales. PLoS One 4, e6671.

Lopman, B.A., Hall, A.J., Curns, A.T., Parashar, U.D., 2011. Increasing rates of gastroenteritis hospital discharges in US adults and the contribution of norovirus, 1996–2007. Clinical Infectious Diseases 52, 466–474.

Lopman, B.A., Reacher, M.H., Vipond, I.B., Hill, D., Perry, C., Halladay, T., Brown, D.W., Edmunds, W.J., Sarangi, J., 2004a. Epidemiology and cost of nosocomial gastroenteritis, Avon, England, 2002–2003. Emerging Infectious Diseases 10, 1827–1834.

Lopman, B.A., Reacher, M.H., Vipond, I.B., Sarangi, J., Brown, D.W.G., 2004b. Clinical manifestation of norovirus gastroenteritis in health care settings. Clinical Infectious Diseases 39, 318–324.

Meyer, S., Elias, J., Höhle, M., 2012. A space-time conditional intensity model for invasive meningococcal disease occurrence. Biometrics 68, 607–616.

O'Neill, P.D., Marks, P.J., 2005. Bayesian model choice and infection route modelling in an outbreak of Norovirus. Statistics in Medicine 24, 2011–2024.

R Development Core Team, 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0.

Rida, W.N., 1991. Asymptotic properties of some estimators for the infection rate in the general stochastic epidemic model. Journal of the Royal Statistical Society. Series B (Methodological) 53, 269–283.

Rosenthal, N.A., Lee, L.E., Vermeulen, B.A.J., Hedberg, K., Keene, W.E., Widdowson, M., Cieslak, P.R., Vinjé, J., 2011. Epidemiological and genetic characteristics of norovirus outbreaks in long-term care facilities, 2003–2006. Epidemiology and Infection 139, 286–294.

Scallan,E.,Hoekstra,R.M.,Angulo,F.J.,Tauxe,R.V.,Widdowson,M.,Roy,S.L.,Jones, J.L.,Griffin,P.M.,2011.FoodborneillnessacquiredintheUnitedStates—major pathogens.EmergingInfectiousDiseases17,7–15.

Shao, Q.X., 1999. Some properties of an estimator for the basic reproduc-tionnumberofthegeneralepidemicmodel.MathematicalBiosciences159, 79–96.

Sukhrie,F.H.A.,Teunis,P.,Vennema,H.,Copra,C.,ThijsBeersma,M.F.C.,Bogerman, J.,Koopmans,M.,2012.Nosocomialtransmissionofnorovirusismainlycaused bysymptomaticcases.ClinicalInfectiousDiseases54,931–937.

Teunis,P.,Heijne,J.C.M.,Sukhrie,F.,vanEijkeren,J.,Koopmans,M.,Kretzschmar, M.,2013.Infectiousdiseasetransmissionasaforensicproblem:whoinfected whom?JournaloftheRoyalSocietyInterface10,20120955.

Thornley,C.N.,Emslie,N.A.,Sprott,T.W.,Greening,G.E.,Rapana,J.P.,2011. Recur-ringnorovirustransmissiononanairplane.ClinicalInfectiousDiseases53, 515–520.

Vignaux, T., Muller, K., Helmbold, B., 2012. SimPy Manual, Available at http://simpy.sourceforge.net

Venables, W.N., Ripley, B.D., 2002. Statistics Complements to Modern Applied Statistics with S, 4th edition, Available at http://www.stats.ox.ac.uk/pub/MASS4/

Voirin, N., Roche, S., Vanhems, P., Giard, M., David-Tchouda, S., Barret, B., Ecochard, R., 2011. A multiplicative hazard regression model to assess the risk of disease transmission at hospital during community epidemics. BMC Medical Research Methodology 11, 53.

Wickham, H., 2009. ggplot2: Elegant Graphics for Data Analysis. Springer, New York.

Wikswo, M.E., Cortes, J., Hall, A.J., Vaughan, G., Howard, C., Gregoricus, N., Cramer, E.H., 2011. Disease transmission and passenger behaviors during a high morbidity Norovirus outbreak on a cruise ship, January 2009. Clinical Infectious Diseases 52, 1116–1122.

Wilkinson, G.N., Rogers, C.E., 1973. Symbolic description of factorial models for analysis of variance. Applied Statistics 22, 392–399.

Yen, C., Wikswo, M.E., Lopman, B.A., Vinjé, J., Parashar, U.D., Hall, A.J., 2011. Impact of an emergent norovirus variant in 2009 on norovirus outbreak activity in the United States. Clinical Infectious Diseases 53, 568–571.

Zelner, J.L., King, A.A., Moe, C.L., Eisenberg, J.N.S., 2010. How infections propagate after point-source outbreaks: an analysis of secondary norovirus transmission. Epidemiology 21, 711–718.

Zelner, J.L., Lopman, B.A., Hall, A.J., Ballesteros, S., Grenfell, B.T., 2013. Linking time-varying symptomatology and intensity of infectiousness to patterns of norovirus transmission. PLoS ONE 8, e68413.

Figure

Fig. 1. Estimates versus number of outbreaks. The row names indicate parameters.
Fig. 5. Quantile–quantile plots of the cumulative exposure of people at the time of infection
Fig. 6. Quantile–quantile plots of the symptomatic periods. The semi-transparent lines connect data points from the same outbreak
