
Fitting outbreak models to data from many small norovirus outbreaks


University of Nebraska - Lincoln

DigitalCommons@University of Nebraska - Lincoln

USDA National Wildlife Research Center - Staff

Publications

U.S. Department of Agriculture: Animal and Plant

Health Inspection Service

2014


Eamon B. O'Dea

University of Texas at Austin

Kim M. Pepin

Colorado State University


Ben A. Lopman

Health Protection Agency

Claus O. Wilke

University of Texas at Austin


This Article is brought to you for free and open access by the U.S. Department of Agriculture: Animal and Plant Health Inspection Service at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in USDA National Wildlife Research Center - Staff Publications by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln.

O'Dea, Eamon B.; Pepin, Kim M.; Lopman, Ben A.; and Wilke, Claus O., "Fitting outbreak models to data from many small norovirus outbreaks" (2014). USDA National Wildlife Research Center - Staff Publications. 1939.


Contents lists available at ScienceDirect

Epidemics

journal homepage: www.elsevier.com/locate/epidemics

Fitting outbreak models to data from many small norovirus outbreaks

Eamon B. O’Dea a,∗, Kim M. Pepin b,c, Ben A. Lopman d,1, Claus O. Wilke a,e

a Section of Integrative Biology, University of Texas at Austin, 1 University Station C0930, Austin, TX 78712, USA

b Fogarty International Center, NIH, Bethesda, MD 20892, USA

c Department of Biology, Colorado State University, Fort Collins, CO 80523, USA

d Gastrointestinal, Emerging and Zoonotic Infections Department, Centre for Infections, Health Protection Agency, London NW9 5EQ, UK

e Center for Computational Biology and Bioinformatics and Institute for Cell and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA

Article info

Article history:
Received 11 June 2012
Received in revised form 23 October 2013
Accepted 23 December 2013
Available online 8 January 2014

Keywords:
Stochastic epidemic model
Parameter estimation
Norovirus
Health-care-associated infection
Generalized linear model

Abstract

Infectious disease often occurs in small, independent outbreaks in populations with varying characteristics. Each outbreak by itself may provide too little information for accurate estimation of epidemic model parameters. Here we show that using standard stochastic epidemic models for each outbreak and allowing parameters to vary between outbreaks according to a linear predictor leads to a generalized linear model that accurately estimates parameters from many small and diverse outbreaks. By estimating initial growth rates in addition to transmission rates, we are able to characterize variation in numbers of initially susceptible individuals or contact patterns between outbreaks. With simulation, we find that the estimates are fairly robust to the data being collected at discrete intervals and imputation of about half of all infectious periods. We apply the method by fitting data from 75 norovirus outbreaks in health-care settings. Our baseline regression estimates are 0.0037 transmissions per infective-susceptible day, an initial growth rate of 0.27 transmissions per infective day, and a symptomatic period of 3.35 days. Outbreaks in long-term-care facilities had significantly higher transmission and initial growth rates than outbreaks in hospitals.

© 2014 The Authors. Published by Elsevier B.V.

Introduction

A common and difficult problem in epidemiology is to estimate rates of disease spread. Accurate estimates of these and other population parameters are crucial in the evaluation of disease control measures (Anderson and May, 1992; Keeling, 2005; Halloran et al., 2009) or biological hypotheses (Lively, 2010). Heterogeneity complicates the problem of obtaining such estimates. For example, a person’s risk of infection depends on contact rates and acquired immunity, and these quantities can vary widely between people and outbreaks.

Norovirus (NoV) epidemiology provides a fine case in point of the need for models to accommodate heterogeneity. Noroviruses are the most common cause of diarrheal disease in the United States, causing an estimated 21 million cases (Scallan et al., 2011) and 71,000 hospitalizations per year (Lopman et al., 2011). A genetically diverse group of strains is often circulating within a population. New strains of the predominant genogroup 2 genotype 4 (GII.4) taxon appear regularly over time (Glass et al., 2009), and a person’s risk of infection, given exposure, likely depends on both the antigenicity of the virus and the type-specific immunity developed from the person’s previous exposure (Cannon et al., 2009). Other important heterogeneities include innate susceptibility (which depends on a person’s histo-blood group antigens and secretor status) and age-specific risks of exposure. Outbreak investigations (Evans et al., 2002; Thornley et al., 2011; Wikswo et al., 2011) have provided convincing evidence that single vomiting incidents in crowded settings can lead to scores of secondary cases. Models that account for both between-individual and between-population heterogeneity are needed to obtain the accurate parameter estimates required for predicting outbreak dynamics and implementing effective controls. At present, control measures are based on general infection-control principles (Centers for Disease Control and Prevention, 2011) and thus are likely to be somewhat inefficient.

∗ Corresponding author. Tel.: +1 202 784 7182.

1 Present address: Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30030, USA.

A further complication for modeling norovirus transmission is that it often occurs in small outbreaks. The transmission and recovery times of cases in small outbreaks are correlated (Rida, 1991), which makes estimation difficult when using data from a single outbreak. An obvious solution to reducing the inaccuracy caused by within-outbreak correlations in data is to base estimates on data from multiple outbreaks.

1755-4365 © 2014 The Authors. Published by Elsevier B.V.
http://dx.doi.org/10.1016/j.epidem.2013.12.002
Open access under the CC BY-NC-SA license.

Methods for estimating parameters from multiple outbreaks have been described before, but often have been developed for smaller data sets and computing resources than what are now available. For example, the previous approaches of Becker (1979) and Becker (1991) assumed only the observation of the final state of each outbreak was available, used moments estimators, did not formulate a general regression model to allow for variation in parameters between outbreaks, and may be implemented with pencil and paper. Our norovirus outbreak data set includes the full observation of a large number of outbreaks and a number of covariates that are likely to affect parameters. We thus here employ a different method that operates on the full observation of outbreaks, uses maximum-likelihood estimators, models the effect of covariates on outbreak parameters within a general regression framework, and exploits modern computing power to find estimates and their confidence intervals.

We propose a general approach to fitting data from many small outbreaks. Using simulated data, we assessed the performance of the proposed method as a function of the number of outbreaks in the data, the rounding of measurements to regular intervals of observation, the number of missing observations, and the imputation of missing observations. When the number of outbreaks was large, we found the performance to be satisfactory for data sets with realistic levels of all of these challenging features. Fitting our model to data from a large number of real norovirus outbreaks in health-care facilities, we found a distinct increase in transmission and initial growth rates in long-term-care facilities relative to hospitals. We examined the fit of the model and found the most noticeable defect to be lower-than-observed prediction of the initial growth of the outbreaks. However, the predicted dynamics became more accurate over time such that predictions never deviated widely from observations.

Methods

We developed the methods described in this section to fit a model of the outbreak dynamics of norovirus based on data from a large survey of gastroenteritis in health-care facilities in the former County of Avon, England. In this study, the events of symptom onset and recovery were recorded on a daily basis for cases of gastroenteritis in both care staff and patients in 15 hospitals and 135 long-term-care facilities over a year-long period in 2002–2003, and these events were classified into a total of 271 separate outbreaks (Lopman et al., 2004b). These outbreaks were for the most part small; the range in total cases spans from 2 to 90 cases and the median is 13 cases.

We begin by presenting our estimation methods. With the method defined, we then describe assumptions and imputation procedures used to prepare our data for application of the method. To complete the model specification for our application, we next describe the variables of the data chosen to be predictors of how parameters vary among outbreaks. Finally, we provide details about methods of simulation, calculation of confidence intervals, and choice of software.

Model

Although our aim is to introduce a general approach, we aim to do so by way of example. Thus we describe our methods in terms of a specific model choice made for the norovirus data. However, we do provide references to relevant results in the regression literature to indicate the full scope of this approach.

The states and transition rules for the model we adopt for individual outbreaks are as follows. The population consists of a fixed number of people of one or more types. The term type here identifies people by the rules governing their movement between different states with respect to norovirus infection. At the beginning of an outbreak, there is some positive number of people in an exposed, or latent, state for at least one of the types. This state represents people who have been exposed to an infection source and have a latent infection but are not contagious. They move to an infective state after an incubation period of fixed duration. The infective state represents contagious people, and for simplicity we assume that all contagious people are symptomatic. A susceptible state represents people who are susceptible to infection. Thus each susceptible of type $i$ moves to the latent state at the first point of a Poisson process with rate $\beta_i Y(t)$, where $\beta_i$ is the transmission rate for type-$i$ susceptibles and $Y(t)$ is the number of infectives at time $t$. All infective types have the same level of contagiousness and have gamma-distributed symptomatic periods with the same dispersion parameter, but the mean symptomatic period may differ between types. Further, types that represent care staff are moved into an infective-but-removed state when the time they have spent in the infective state exceeds a threshold of fixed duration. This transition rule represents the effect of infection-control policies that prevent staff from working when contagious. At the end of their symptomatic periods, infective and infective-but-removed people are moved into a recovered state. The recovered state represents individuals that gain immunity over the course of the outbreak. The outbreak ends when the number of infected people reaches zero.

In summary, our outbreak model is the widely studied susceptible-exposed-infective-recovered (SEIR) model with four customizations for our application. First, we allow people to vary in susceptibility and expected duration of infectiousness. Second, we do not make our transmission rate depend on the total number of people in the population. This departure prevents the need for the total number of people to be estimated, and it is appropriate in small populations when an infective person may be able to infect every susceptible person in the population with approximately the same probability. For example, Forrester and Pettitt (2005) did not find that inclusion of the total population size significantly improved the fit of a model of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks within an intensive-care unit. Third, we do not assume that latent periods and infectious periods are exponentially distributed. Our approach is more realistic because it allows the probability of a person leaving a latent or infectious state to depend on how long she has been in that state. Fourth, we shunt some of the infectives into an infective-but-removed state to represent the isolation of contagious staff from the population.

As indicated in our outbreak model description, the rate at which a susceptible acquires infection from an infective may vary among members of a population, and we use the word type in a general sense to refer to subsets of the population that are assumed to be the same with respect to such variation. With multiple-outbreak data, we further define types as unique to individual outbreaks. In other words, we make no general assumption that people in different outbreaks may be modeled with the same parameters. We shall later choose a particular linear model that controls the extent to which parameters may vary among types, but many other choices for such models are possible within this framework. Types thus represent the fundamental unit of variation in this framework, and the likelihood function naturally breaks apart into factors for each type.

For each type, the recovery-time and transmission-time parts of the likelihoods further factor apart into common density functions. The simplicity of these functions belies an involved construction, available in Kalbfleisch and Prentice (2002), as the product integral of the likelihood of events in infinitesimal time steps, where the likelihood of each time step is conditional on the history of the model up until that time step. We shall introduce the full likelihood by introducing each of these functions in turn.

For type-$i$ people, the recovery-time part of the likelihood is

$$ l_{\mathrm{rec}}(\mu_i, \phi) = \prod_{j=1}^{k_i} \frac{1}{\Gamma(1/\phi)\,(\phi\mu_i)^{1/\phi}}\, I_{i,j}^{1/\phi - 1} \exp\!\left(-\frac{I_{i,j}}{\phi\mu_i}\right), \tag{1} $$

where $k_i$ is the number of type-$i$ people infected over the course of an outbreak, $I_{i,j}$ denotes the length of the symptomatic period of the $j$th type-$i$ infection, $\mu_i$ is the mean of the symptomatic period of type-$i$ infections, and $\phi$ is the dispersion parameter, which we take to be the same for all types of infections. Eq. (1) represents the likelihood function for a joint distribution of gamma-distributed random variables. Recall that per our model definition, the symptomatic periods $I_{i,j}$ are gamma distributed.
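As a minimal sketch of how the recovery-time likelihood (1) could be evaluated numerically, the Python function below computes its logarithm for a single type, using a gamma density with shape $1/\phi$ and scale $\phi\mu_i$. The function name and the example values are illustrative assumptions, not part of the original analysis.

```python
import math


def gamma_loglik(periods, mu, phi):
    """Log of the recovery-time likelihood for one type.

    periods: symptomatic-period lengths I_{i,j} (all > 0)
    mu:      mean symptomatic period for this type
    phi:     dispersion parameter shared across types
    """
    shape = 1.0 / phi   # gamma shape parameter
    scale = phi * mu    # gamma scale, so that shape * scale equals mu
    total = 0.0
    for x in periods:
        total += (-math.lgamma(shape) - shape * math.log(scale)
                  + (shape - 1.0) * math.log(x) - x / scale)
    return total


# Hypothetical symptomatic periods, in days.
example = gamma_loglik([2.0, 3.5, 4.0], mu=3.35, phi=0.58)
```

With $\phi = 1$ the gamma density reduces to an exponential density with mean $\mu_i$, which gives a quick check of the implementation.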

The transmission-time part of the likelihood for type-$i$ people is

$$ l_{\mathrm{tr}}(\beta_i, X_i^{(0)}) = \frac{X_i^{(0)}!}{(X_i^{(0)} - k_i)!} \exp\!\left[-\beta_i \tau_i (X_i^{(0)} - k_i)\right] \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}), \tag{2} $$

where $X_i^{(0)}$ is the number of initial susceptibles, $\tau_i$ is the cumulative exposure of such people at the end of an outbreak (i.e., the total area under $Y(t)$), $Y_{i,j}$ is the number of infectives present when the $j$th such person becomes infected, and $h_{i,j}$ is the cumulative exposure of the $j$th such person when infected. Further discussion of this likelihood function is provided in the Appendix.
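The transmission-time likelihood (2) can be sketched in the same way. The Python function below is an illustrative assumption rather than the code used for the paper; it replaces the factorials with the log-gamma function so that the number of initial susceptibles may take non-integer values, and it writes the cumulative exposure at the end of the outbreak as `tau`.

```python
import math


def transmission_loglik(beta, x0, tau, infectives_at_infection, exposures_at_infection):
    """Log of the transmission-time likelihood for one type.

    beta: transmission rate
    x0:   number of initial susceptibles (may be non-integer)
    tau:  cumulative exposure at the end of the outbreak (area under Y(t))
    infectives_at_infection: Y_{i,j}, infectives present at each infection
    exposures_at_infection:  h_{i,j}, cumulative exposure at each infection
    """
    k = len(exposures_at_infection)
    if x0 < k:
        raise ValueError("x0 must be at least the number of cases")
    # log[x0! / (x0 - k)!] via the log-gamma function
    ll = math.lgamma(x0 + 1.0) - math.lgamma(x0 - k + 1.0)
    # susceptibles who escaped infection for the whole outbreak
    ll -= beta * tau * (x0 - k)
    # one factor per infection event
    for y, h in zip(infectives_at_infection, exposures_at_infection):
        ll += math.log(beta * y) - beta * h
    return ll


# Hypothetical outbreak with two cases among ten initial susceptibles.
example = transmission_loglik(0.0037, 10.0, 12.0, [1, 2], [1.5, 4.0])
```

Because the per-case exposures enter only through `- beta * h`, the function depends on them only through their sum.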

In many cases, converting the data to a minimally sufficient form may be desirable for the preservation of patient privacy. An example of such a form would be to summarize the data as $\sum_j h_{i,j}$, $\sum_j I_{i,j}$, $k_i$, and $\tau_i$ for each type $i$. Such a form would still allow for calculation of maximum-likelihood estimates and Hessian-based (Wald) confidence intervals.

This minimally sufficient form of the data also illustrates the robustness of estimates to some imperfections of the data. The $h_{i,j}$ and $I_{i,j}$ only affect the likelihood through the sums $\sum_j h_{i,j}$ and $\sum_j I_{i,j}$. Thus some error in our calculation of $h_{i,j}$ and $I_{i,j}$ should not bias our estimates too much as long as the average error is close to zero, and thus great certainty about $Y(t)$ is not necessary. For example, if we underestimated $Y(t)$ at some points as a result of asymptomatic infectives being present in reality and we overestimated $Y(t)$ at some other points as a result of misdiagnoses, those errors may cancel each other to some extent. Also, sometimes the data consist of only the times at which people stop being infectious, for example, when people are isolated after being identified as infective. In such cases total exposure could still be estimated by using a kernel-smoothing method (Lau and Yip, 2008).

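The likelihood (2) can be parameterized differently as

$$ l_{\mathrm{tr}}(\beta_i, r_i) = \frac{(r_i/\beta_i)!}{(r_i/\beta_i - k_i)!} \exp\!\left[\tau_i(\beta_i k_i - r_i)\right] \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}), \tag{3} $$

where $r_i = \beta_i X_i^{(0)}$ is the initial per-infective incidence rate. In our application, we choose to estimate $r_i$ instead of $X_i^{(0)}$ because $r_i$ is easier to interpret in the context of our data. For brevity, we refer to $r_i$ as the initial growth rate.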
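The step from (2) to (3) is a direct substitution, and it may help to see it written out. With $\tau_i$ denoting the cumulative exposure at the end of the outbreak and $X_i^{(0)} = r_i/\beta_i$, the two parameter-dependent factors of (2) transform as follows, while the product over infection events is unchanged:

```latex
\begin{align*}
\frac{X_i^{(0)}!}{(X_i^{(0)} - k_i)!}
  &= \frac{(r_i/\beta_i)!}{(r_i/\beta_i - k_i)!}, \\
\exp\!\left[-\beta_i \tau_i \left(X_i^{(0)} - k_i\right)\right]
  &= \exp\!\left[-\beta_i \tau_i \left(\frac{r_i}{\beta_i} - k_i\right)\right]
   = \exp\!\left[\tau_i \left(\beta_i k_i - r_i\right)\right].
\end{align*}
```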

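The full likelihood function that we use for an $n$-outbreak data set is then

$$ l(\boldsymbol{\beta}, \mathbf{r}, \boldsymbol{\mu}, \phi) = \prod_{i} l_{\mathrm{tr}}(\beta_i, r_i)\, l_{\mathrm{rec}}(\mu_i, \phi), \tag{4} $$

where we use boldface to denote vectors with elements equal to the parameters for each type $i$.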

To make use of previous results from statistical theory as well as to use conventional language when writing about our model, we shall next present our model as a generalized linear model (GLM). GLMs are a broad class of statistical models that includes many commonly used regression models. A GLM consists of three components: (i) a density function from the exponential family, (ii) a linear model that maps predictive variables to a predictor, and (iii) a link function that maps the predictor to the mean of the density function.

Our likelihood functions, (1) and (3), fit the definition of exponential family densities. That is not to say that the transmission and recovery times from a small outbreak are independent random variables with those densities. In fact, they may be highly correlated (Rida, 1991). But the situation is analogous to that of GLMs for longitudinal data, where ignoring within-subject correlations increases the variance of estimates but still leads to accurate estimates in the limit of data from a large number of independent subjects (Liang and Zeger, 1986).

We obtain a linear model by associating each type of person in the model with a set of predictive variables. In the application to norovirus we describe here, such predictive variables are, for example, the type of facility in which an outbreak occurred (e.g., hospital or long-term-care facility). We combine these predictive variables into a design matrix $Z$, which has a row for each type $i$ and a column for each predictive variable. The linear mapping from multiple predictive variables to a linear predictor is achieved by multiplying the design matrix with a vector of regression parameters $c$.

As link function, we chose the natural log, which tended to perform better than other potential link functions in our application. For example, for transmission-rate estimates $\beta_i$, we let $\log \beta_i = Z_{i,*} c_\beta$, where $Z_{i,*}$ is row $i$ of the design matrix and $c_\beta$ are our regression parameters for the transmission rates.
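The mapping from a design-matrix row to a rate under the log link can be illustrated with a short Python sketch. The column names follow the indicator variables used for the norovirus application; the function names and every coefficient except the intercept are hypothetical values chosen only for illustration.

```python
import math

# Columns of the design matrix: an intercept plus four indicator variables.
COLUMNS = ["intercept", "isLTCF", "isSmall", "isSpringSummer", "isStaff"]


def design_row(is_ltcf, is_small, is_spring_summer, is_staff):
    """Build one row Z_{i,*} of the design matrix for a type."""
    return [1.0, float(is_ltcf), float(is_small),
            float(is_spring_summer), float(is_staff)]


def rate_from_predictor(row, coefs):
    """Invert the log link: rate = exp(Z_{i,*} c)."""
    eta = sum(z * c for z, c in zip(row, coefs))
    return math.exp(eta)


# The intercept reproduces the reported baseline transmission rate of 0.0037.
# The LTCF coefficient of 0.5 is hypothetical; the rest are set to zero.
c_beta = [math.log(0.0037), 0.5, 0.0, 0.0, 0.0]

baseline = rate_from_predictor(design_row(False, False, False, False), c_beta)
ltcf = rate_from_predictor(design_row(True, False, False, False), c_beta)
```

On the log scale a coefficient acts multiplicatively on the rate, so the hypothetical value of 0.5 corresponds to a rate ratio of exp(0.5) relative to the reference group.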

The conditions for consistency and asymptotic normality of parameter estimates for GLMs have been given by Fahrmeir (1985). For outbreak data, the values of predictive variables in the data are likely to be somewhat randomly determined, in which case the conditions given by Ding and Chen (2006a,b) apply.

One standard condition for consistency is that the true value of the parameter does not lie on the boundary of parameter space. That condition would seem to be violated for data sets in which the number of cases in an outbreak $k_i$ is equal to the number of susceptibles $X_i^{(0)}$ because the transmission-time part of the likelihood, (2), is defined only when $X_i^{(0)} \geq k_i$. However, given that we approximate the discrete quantity $X_i^{(0)}$ with a continuous one for the purpose of fitting the model, it seems reasonable to consider $k_i - 0.5$ as the lower bound of $X_i^{(0)}$ and to say that

$$ l_{\mathrm{tr}}(\beta_i, X_i^{(0)}) = \frac{X_i^{(0)}!}{0!} \exp(0) \times \prod_{j=1}^{k_i} \beta_i Y_{i,j} \exp(-\beta_i h_{i,j}), $$

when $X_i^{(0)}$ is in $(k_i - 0.5, k_i]$. Then the true value of $X_i^{(0)}$ is guaranteed not to be on the boundary and standard consistency results apply. In the Appendix, we provide an alternative proof of consistency for our model in the simple case that all outbreaks share the same parameters.

Evidence that the model performs well in realistic situations appears in the Results section. We were able to recover from simulated data the parameters for the non-trivial model that we fitted in our application.

We estimated the transmission rate and initial growth rate by maximizing the transmission-time factors in (4) given the outbreak data, using the Newton–Raphson method as implemented in the AD Model Builder (Fournier et al., 2011). To keep the Newton–Raphson search for maximum-likelihood estimates in the feasible parameter space, we added a penalty to the log likelihood whenever the implied final number of susceptibles $x = X_i^{(0)} - k_i$ for an outbreak was too close to zero, $x < \epsilon$. The penalty was of the form $C(x - \epsilon)^2$, where $C$ is an arbitrary numeric constant which we set to $C = 0.01$. Likewise, whenever $x < \epsilon$, we replaced $x$ by $\epsilon/(2 - x/\epsilon)$. Throughout this work, we used $\epsilon = 0.001$.
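The penalty and replacement rule just described can be sketched as a small Python function. This is an illustration of the stated formulas only, with the threshold written as `eps`; the function name is hypothetical, and how the penalty is combined with the objective inside the optimizer is left to the caller.

```python
def bounded_susceptibles(x, eps=0.001, c=0.01):
    """Keep the implied final number of susceptibles positive.

    Returns (value, penalty). For x >= eps the value is x and the penalty
    is zero. For x < eps the value is eps / (2 - x / eps), which stays
    positive, and the penalty is c * (x - eps) ** 2.
    """
    if x >= eps:
        return x, 0.0
    penalty = c * (x - eps) ** 2
    return eps / (2.0 - x / eps), penalty


# A feasible value passes through unchanged; an infeasible one is mapped
# to a small positive number and incurs a penalty.
ok_value, ok_penalty = bounded_susceptibles(1.0)
fixed_value, fixed_penalty = bounded_susceptibles(-0.5)
```

The replacement is continuous at the threshold, since $\epsilon/(2 - \epsilon/\epsilon) = \epsilon$, which keeps the objective smooth enough for a derivative-based search.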

We estimated the mean infectious period and the dispersion of the infectious period by using the glm function in R (R Development Core Team, 2010). By default, the dispersion parameter for gamma GLMs is estimated via the moments estimate for the coefficient of variation, perhaps because estimates based on the residual deviance are sensitive to small values in the data (Venables and Ripley, 2002, p. 9). Consistently, we typically found default estimates to be more accurate than ML estimates when fitting small, simulated data sets and we found them to be very similar to ML estimates when fitting large data sets. To be consistent in our treatment of both small and large data sets, we used the default estimate of the dispersion parameter throughout this work.
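For intuition, a moments estimate of the dispersion can be sketched in Python for the simplest case of a single type with no predictive variables. The sketch assumes the estimate takes the Pearson form, the sum of squared relative residuals divided by the residual degrees of freedom; the text above says only that a moments estimate for the coefficient of variation is used, so this form and the function name are assumptions.

```python
def moment_dispersion(periods):
    """Intercept-only gamma fit: return (mean, dispersion).

    The mean is the sample mean. The dispersion is the sum of squared
    relative residuals divided by n - 1, which is the squared coefficient
    of variation of the sample.
    """
    n = len(periods)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(periods) / n
    pearson = sum(((x - mean) / mean) ** 2 for x in periods)
    return mean, pearson / (n - 1)


# Hypothetical symptomatic periods, in days.
mean_period, dispersion = moment_dispersion([1.0, 2.0, 3.0])
```

Unlike a deviance-based estimate, this quantity involves no logarithm of the observations, so values close to zero do not dominate it.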

Data

The norovirus (NoV) data we analyze here originated in a prospective surveillance program in hospitals and long-term-care facilities in England (Lopman et al., 2004a,b). We analyzed the dynamics of 75 outbreaks laboratory-confirmed to be caused by NoV in which a total of 1523 cases of gastroenteritis occurred among patients and staff. We selected these data from the larger data set produced by the surveillance program as follows.

Most records of infections that were attributed in whole or in part to norovirus included the dates of both the onset of and the recovery from symptoms. However, in many records both dates were missing, and in most outbreaks some records lacked at least one date.

We discarded all records from outbreaks in which more than 55% of the dates of recovery were missing. In the remaining outbreaks, we replaced missing dates of recovery with the corresponding onset date plus the median symptomatic period from complete records in that outbreak. These replacements were done as a preparation for the estimation of the transmission rates and were not included when estimating symptomatic periods.
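The screening and imputation rule for recovery dates can be sketched as follows. The record layout, a list of (onset day, recovery day) pairs with `None` marking a missing recovery date, and the function name are assumptions made for illustration; records with missing onset dates are assumed to have been handled beforehand.

```python
from statistics import median


def impute_recovery(records, max_missing_fraction=0.55):
    """Impute missing recovery days for one outbreak.

    records: non-empty list of (onset_day, recovery_day or None).
    Returns None if more than max_missing_fraction of recovery days are
    missing (the outbreak is discarded). Otherwise returns the records
    with each missing recovery day replaced by the onset day plus the
    median symptomatic period of the complete records.
    """
    missing = sum(1 for _, recovery in records if recovery is None)
    if missing > max_missing_fraction * len(records):
        return None
    complete = [recovery - onset for onset, recovery in records
                if recovery is not None]
    typical = median(complete)
    return [(onset, recovery if recovery is not None else onset + typical)
            for onset, recovery in records]


# Hypothetical outbreak with one missing recovery day.
filled = impute_recovery([(0, 3), (1, 5), (2, None)])
```

A caller would keep track of which recovery days were imputed so that they can be excluded when estimating symptomatic periods.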

We discarded all records where the onset date was missing. This practice is unlikely to introduce a large bias as long as a relatively small number of onset dates are discarded. We made sure that this number was relatively small by using data only from outbreaks in which the number of records that were missing onset dates was less than 7% of the number of records that were not missing onset dates.

We made several simplifying assumptions. We assumed a person is infective only when symptomatic, which is supported by Sukhrie et al. (2012). We further assumed that staff move to the infective-but-removed state after one day of symptoms, in accordance with an infection control policy. Of course, staff with norovirus symptoms are likely to stop working sooner than that in many cases. But it seems likely that all infective staff have some small probability of transmitting the disease to others before they leave. Zelner et al. (2013) found that household outbreak data supported a model in which infectiousness is highest at the onset of symptoms, which is often when vomiting occurs. A one-day infectious period is a simple way of modeling the effect of people who may, in fact, only be present for shorter periods with higher infectiousness.

Table 1
Distribution of norovirus outbreaks over the levels of the predictive variables in our regression model.

Facility   Season          Size class   # outbreaks
LTCF       Fall–winter     Small         1
LTCF       Fall–winter     Large         3
LTCF       Spring–summer   Small         4
LTCF       Spring–summer   Large         4
Hospital   Fall–winter     Small        18
Hospital   Fall–winter     Large        23
Hospital   Spring–summer   Small        11
Hospital   Spring–summer   Large        11

We also assumed that the latent period is fixed at 24 h, which falls well within the reported range of 12–48 h (Centers for Disease Control and Prevention, 2011). To keep the transmission-time part of the likelihood positive, we assumed a small, background hazard of infection ($10^{-8}$ that of an infective) triggered illness in cases when no infectives were present. We also assumed that the number of initial infective people was equal to the number of people reporting symptoms on the first day of the outbreak. Finally, we assumed that any changes in state happen at the same time each day.

Predictive variables

The predictive variables that determined our design matrices were as follows. The data were collected over the course of a one-year period beginning in April 2002, and we categorized the data into two groups by the period in which they began: spring–summer refers to outbreaks that started between April 1 and October 1 of the study year; fall–winter refers to outbreaks that began in the remainder of the study year. The period variable allows for variation in transmission rate as a result of seasonality of NoV.

As an additional predictive variable, we include what type of facility the outbreak occurred in, hospital or long-term-care facility (LTCF).

The third predictive variable we use is size class. We classify units in which the number of beds is less than or equal to the median number of beds as small. We classify the other units as large. This classification was done separately for hospital and LTCF units because LTCF units are usually larger than hospital units. For the hospitals, the small units have 6–22 beds and the large units have 24–33 beds. For the LTCF units, the small units have 6–34 beds and the large units have 36–66 beds. The size class variable allows the number of initial susceptibles to depend on the approximate total number of people in each unit. The variable also allows population sizes to affect contact rates.

The fourth predictive variable we use is case type, the two types being patient and staff. Case type is the only predictive variable that varied within outbreaks.

We use a facility–size-class–period–case-type combination with a relatively large amount of data as the reference group. Specifically, the reference group comprises outbreaks that occurred among patients in large care-units of hospitals that began between October 2002 and April 2003. The estimated rate parameter for the reference group serves as the coefficient of the intercept of the linear model. Estimates for other coefficients then inform us of how moving away from the reference group changes rate estimates. Table 1 contains the distribution of outbreaks among the levels of the predictive variables.

We do not consider interaction terms in this work. So in summary, Wilkinson and Rogers (1973) notation for the linear predictors for all our parameters is: 1 + isLTCF + isSmall + isSpringSummer + isStaff.

Confidence intervals

To obtain confidence intervals for the estimates, we performed a parametric bootstrap. Data were simulated according to our outbreak model with the estimated parameters. Each simulation produced data from a set of outbreaks equal in size to the set that we fit, with each outbreak in the simulation matching an outbreak in the fitted data in terms of initial number of infectives, predictive variables, fraction of case records with missing onset and recovery times, and fraction of cases with missing recovery times. Percentile confidence intervals for regression coefficients were estimated from 10,000 simulation replicates.
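The structure of a parametric bootstrap with percentile intervals can be sketched generically in Python. The toy model below, exponential symptomatic periods fitted by their sample mean, is a stand-in chosen only to keep the example self-contained; it is not the outbreak model, and the function names, the nearest-rank percentile rule, and the replicate count are all assumptions.

```python
import math
import random


def percentile_interval(values, level=0.95):
    """Percentile interval from bootstrap estimates (nearest-rank rule)."""
    ordered = sorted(values)
    alpha = 1.0 - level
    lo = ordered[math.floor(alpha / 2.0 * (len(ordered) - 1))]
    hi = ordered[math.ceil((1.0 - alpha / 2.0) * (len(ordered) - 1))]
    return lo, hi


def parametric_bootstrap(simulate, fit, estimate, replicates, rng):
    """Simulate from the fitted model, then refit each simulated data set."""
    return [fit(simulate(estimate, rng)) for _ in range(replicates)]


# Toy stand-in for the outbreak model: exponential periods with a given mean.
def simulate_periods(mean, rng, n=50):
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


def fit_mean(periods):
    return sum(periods) / len(periods)


rng = random.Random(1)
boot = parametric_bootstrap(simulate_periods, fit_mean, 3.35, 200, rng)
lower, upper = percentile_interval(boot)
```

In the application, `simulate` would also reproduce the missing-data and imputation pattern of each fitted outbreak, so that the intervals reflect those sources of error as well.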

Simulation

We used simulation to investigate how the bias and variance of our estimates depend on the number of outbreaks that they are based on as well as the amount of missing information. We also used simulation to generate bootstrap confidence intervals.

Simulations began with some initial numbers $X_i^{(0)}$ of type-$i$ susceptibles of one or more types. To initiate the outbreak, some additional susceptibles were added and moved into a latent state. All people entering the latent state moved to the infective state after a fixed time period. Type-$i$ people entering the infective state moved on to the recovered state after a gamma-distributed time period with mean $\mu_i$ and dispersion parameter $\phi$. For types of infectives representing care staff, movement to the infective-but-removed state preceded movement to the recovered state if and when time spent in the infective state exceeded a predetermined threshold. Every time the number of infectives or susceptibles changed, the time of a potential transmission event was calculated by drawing from an exponential distribution with rate $Y \sum_i \beta_i X_i$, where $Y$ is the number of (non-removed) infectives and $\beta_i$ is the transmission rate for susceptibles of type $i$. If the potential transmission was sooner than the next change in $Y$, a type of susceptible was chosen with probability proportional to $\beta_i X_i$ and moved into the latent state. Simulations stopped when the number of latent, infective, and infective-but-removed people reached zero. The output of the simulations was a record for each person infected giving transition times.
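A simplified version of this simulation scheme can be sketched with the Python standard library alone. The sketch assumes a single type and no infective-but-removed state, and it uses a priority queue of scheduled transitions rather than SimPy; the function name, record fields, and default arguments are illustrative assumptions.

```python
import heapq
import random


def simulate_outbreak(beta, x0, mu, phi, latent=1.0, initial_latent=1, rng=None):
    """Simulate one single-type outbreak; times are in days.

    Returns one record per infected person with the times of exposure,
    symptom onset, and recovery.
    """
    rng = rng or random.Random()
    shape, scale = 1.0 / phi, phi * mu   # gamma with mean mu, dispersion phi
    t, susceptibles, infectives = 0.0, x0, 0
    records = {i: {"exposed": 0.0} for i in range(initial_latent)}
    events = [(latent, "onset", i) for i in range(initial_latent)]
    heapq.heapify(events)
    next_id = initial_latent
    while events:  # empty queue means no latent or infective people remain
        rate = beta * infectives * susceptibles
        # Redraw the potential transmission time after every state change.
        t_transmit = t + rng.expovariate(rate) if rate > 0 else float("inf")
        if t_transmit < events[0][0]:
            t = t_transmit
            susceptibles -= 1
            records[next_id] = {"exposed": t}
            heapq.heappush(events, (t + latent, "onset", next_id))
            next_id += 1
        else:
            t, kind, person = heapq.heappop(events)
            if kind == "onset":
                infectives += 1
                records[person]["onset"] = t
                duration = rng.gammavariate(shape, scale)
                heapq.heappush(events, (t + duration, "recovery", person))
            else:
                infectives -= 1
                records[person]["recovery"] = t
    return [records[i] for i in sorted(records)]


# 72 initial susceptibles corresponds to r / beta = 0.2664 / 0.0037.
cases = simulate_outbreak(0.0037, 72, 3.32, 0.58, rng=random.Random(42))
```

Redrawing the exponential waiting time after each change of state is valid because the exponential distribution is memoryless, so discarding an unused draw does not bias the next one.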

Our simulation experiment had a full factorial design, with the number of outbreaks $n$ being 1, 10, or 100; the fraction of recovery times imputed being either zero or approximately the highest such fraction in our real data (0.53); the fraction of records missing both onset and recovery times being either zero or approximately the highest such fraction in our real data (0.05); and onset and recovery times being either rounded to days or exact. For each combination of factor levels, we simulated data and attempted to fit it 10,000 times. These simulations had just one type, just one initially infected person, a transmission rate $\beta$ of 0.0037, an initial growth rate $r$ of 0.2664, a latent period of 24 h, and infectious periods with a mean of 3.32 days and a dispersion parameter of 0.58, and no threshold time beyond which the infectives were moved into an infective-but-removed state.

The details of the simulation of missing data were as follows. First, an outbreak was simulated as usual. Second, the number of case records to remove was calculated as the largest integer less than the total number of case records times the parameter value for the fraction of records missing onset and recovery times. This number of records were selected at random and removed. Then the number of case records for which to impute recovery times was calculated as the largest integer less than the total number of remaining case records times the parameter value for the fraction of records imputed. This number of records were selected at random and given imputed recovery times.
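The two-stage degradation of simulated records can be sketched as below. The sketch reads "largest integer less than" literally as a strict inequality and clamps the result at zero so that a fraction of zero removes nothing; both that reading and the function names are assumptions.

```python
import math
import random


def largest_integer_below(x):
    """Largest integer strictly less than x, but never below zero."""
    return max(0, math.ceil(x) - 1)


def degrade_records(records, frac_missing, frac_imputed, rng):
    """Remove some records at random, then mark some survivors for imputation.

    Returns (kept, to_impute), where kept is the list of surviving records
    and to_impute is a set of indices into kept whose recovery times
    should be replaced by imputed values.
    """
    records = list(records)
    n_remove = largest_integer_below(len(records) * frac_missing)
    removed = set(rng.sample(range(len(records)), n_remove))
    kept = [rec for i, rec in enumerate(records) if i not in removed]
    n_impute = largest_integer_below(len(kept) * frac_imputed)
    to_impute = set(rng.sample(range(len(kept)), n_impute))
    return kept, to_impute


# Fifty hypothetical case records, degraded at the levels used in the study.
kept, to_impute = degrade_records(range(50), 0.05, 0.53, random.Random(3))
```

With 50 records, the products 2.5 and 25.44 give 2 removed records and 25 imputed recovery times.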

Once-daily observation of the outbreak was simulated by rounding transition times down to the nearest whole day. Outbreaks were started at random times in the first day to prevent the rounding from having artificial effects on the data from small outbreaks.

Our gradient-based optimization code for model fitting, which worked well at estimating transmission rate parameters when the number of outbreaks was large, did not work well when the number of outbreaks was small. So we used specialized code to fit the models of the simulation study, which were more analytically tractable by virtue of not having linear predictors. The Appendix describes the basis for this code, which always finds the maximum-likelihood estimate if it exists and identifies cases in which no such estimate exists.

Software

Our outbreak simulation code made use of the SimPy (Vignaux et al., 2012) Python module. The RngStreams C library (L’Ecuyer et al., 2002) allowed for the simulations to run in parallel. We used the AD Model Builder (Fournier et al., 2011) and R2admb (Bolker and Skaug, 2011), an R (R Development Core Team, 2010) interface for it, to optimize the log likelihood. We prepared graphics with the R package ggplot2 (Wickham, 2009). Code capable of reproducing the results is available from the authors on request.

Results

We first present a simulation study that illustrates the accuracy of our methods on a large number of small outbreaks that were simulated from our model. With that self-consistency test passed, we then present estimates from the application of our method to the norovirus data. To show the extent to which our model was appropriate for the norovirus data, we then present diagnostics of the fit.

Simulation

We used simulation to see how many outbreak data sets may be required for estimates to be approximately normally distributed around the true parameter values. The simulations also allowed us to gauge the effects of the imputation and rounding necessary for our application.

Much previous work has shown that estimation with data from a single, small outbreak is unreliable (Shao, 1999, and refs. therein). Thus one benefit of aggregating data from multiple outbreaks is that it allows for data from minor outbreaks to produce reliable estimates. However, using data from minor outbreaks does represent a worst-case scenario in the sense that each such outbreak contributes only a small amount of information. For those two reasons, and to keep the simulation study at a manageable size, we restricted our simulations to one set of parameters that is guaranteed to result in small outbreaks. To allow for comparison with our fits to the norovirus data, we used the parameters estimated for our baseline regression group.

As expected, the estimates were not very good when using data from single outbreaks (Fig. 1). In about 49% of these simulations, the initial infective failed to infect anyone, limiting estimation to the length of the symptomatic period. In about 13% of these simulations, only one transmission occurred and the transmission and growth rate parameters were unidentifiable. In about 21% of these outbreaks, the estimate of $r$ was on the lower bound of parameter space, preventing calculation of Wald confidence intervals. In the remaining 17% of replicates, the coverage probability of the 95% Wald confidence intervals ranged from 80 to 90% (Table 2) and the bias and average standard error for the transmission rate were almost 100 times the true value of the parameter. The average correlation between the transmission rate and initial growth rate estimates was 94%. Estimates for the symptomatic period, although obtained for all replicates, were also not accurate (Fig. 1 and Table 3).

Table 2
Simulation results for transmission rate $\beta$ and initial growth rate $r$. n denotes the number of outbreaks simulated for an estimate. Imputed refers to the fraction of recovery times deleted and then imputed as described in the Methods section. Missing refers to the fraction of case records deleted before fitting the data. Rounded indicates whether the onset and recovery times were rounded to whole days. In the simulations, $\beta$ was set to 0.0037 transmissions per infective-susceptible day and $r$ was set to 0.2664 transmissions per infective day.

n     Imputed   Missing   Rounded   Bias(β̂)    Av. s.e.(β̂)   β cover. (%)   Bias(r̂)   Av. s.e.(r̂)   r cover. (%)
1     0.00      0.00      0         0.198      0.131         82             0.52      0.834         87
1     0.00      0.00      1         0.196      0.129         82             0.52      0.830         88
1     0.00      0.05      0         0.21       0.130         81             0.55      0.84          87
1     0.00      0.05      1         0.206      0.127         83             0.53      0.83          90
1     0.53      0.00      0         0.251      0.132         80             0.67      0.958         81
1     0.53      0.00      1         0.234      0.137         80             0.65      0.966         81
1     0.53      0.05      0         0.231      0.149         83             0.63      0.98          84
1     0.53      0.05      1         0.214      0.134         81             0.61      0.96          84
10    0.00      0.00      0         0.0154     0.0456        89             0.0389    0.2882        95
10    0.00      0.00      1         0.0155     0.0457        90             0.0378    0.2888        95
10    0.00      0.05      0         0.0160     0.0471        90             0.0370    0.2932        96
10    0.00      0.05      1         0.0160     0.0470        89             0.0373    0.2919        96
10    0.53      0.00      0         0.0158     0.0525        93             0.0602    0.3116        91
10    0.53      0.00      1         0.0162     0.0533        93             0.0608    0.3134        91
10    0.53      0.05      0         0.0168     0.0558        92             0.0583    0.319         92
10    0.53      0.05      1         0.0170     0.0562        93             0.0587    0.319         92
100   0.00      0.00      0         0.00055    0.00571       94             0.0027    0.07380       95
100   0.00      0.00      1         0.00053    0.00569       94             0.0025    0.07370       95
100   0.00      0.05      0         0.00066    0.00614       94             −0.0001   0.07533       95
100   0.00      0.05      1         0.00067    0.00617       93             −0.0004   0.07535       95
100   0.53      0.00      0         −0.00033   0.00668       96             0.0195    0.08036       82
100   0.53      0.00      1         −0.00034   0.00653       96             0.0131    0.07844       86
100   0.53      0.05      0         −0.00018   0.00724       97             0.0174    0.08222       85
100   0.53      0.05      1         −0.00024   0.00707       97             0.0103    0.08018       87

Rounding, deleting 5% of case records, and imputing 53% of recovery times all generally increased the average standard error of estimates, with effects in that order. Effects on the bias were somewhat more variable, but the asymptotic effects of these procedures on the bias appear to be zero. However, even in the 100-outbreak scenario the imputation caused coverage probabilities for r to deviate by as many as 13 percentage points from 95% (Table 2), which recommends the use of confidence intervals that account for the imputation, such as the ones we used in our application.

On the whole, the estimates were much more accurate in the 10- and 100-outbreak scenarios (Fig. 1, Tables 2 and 3). They were also more robust. Estimates for r were on the lower bound 5% of the time in the 10-outbreak scenario and never on the lower bound in the 100-outbreak scenario. The likelihood was divergent about 7–10% of the time in the 10-outbreak scenario versus 0.1–2% of the time in the 100-outbreak scenario. The average correlation between the estimated transmission rate and growth rate was about 0.83 and 0.74 for replicates in the 10- and 100-outbreak scenarios, respectively.

In sum, the method works well with a sufficiently large data set. Moderate amounts of imputation, missing data, and rounding will have mostly modest effects on estimates. Simulation, as part of a parametric bootstrap procedure, can provide an indication of the accuracy of estimates for a particular data set of interest. We demonstrate such a procedure in our application.
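The idea of checking coverage by simulation can be sketched with a deliberately simple stand-in model. The Python sketch below is an illustration under stated assumptions, not the simulation code used in this study: it replaces the outbreak model with independent exponential waiting times, and the function name wald_coverage and its arguments are hypothetical. It simulates data at a known rate, refits the rate by maximum likelihood, and counts how often a 95% Wald interval on the log scale contains the true value.

```python
import math
import random


def wald_coverage(true_rate, n_obs, n_reps, seed=1):
    """Fraction of replicates whose 95% Wald interval for log(rate) covers the truth.

    Toy model (an assumption for illustration): n_obs independent exponential
    waiting times with rate true_rate, refitted by maximum likelihood.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_reps):
        # Simulate one data set from the known parameter value.
        sample = [rng.expovariate(true_rate) for _ in range(n_obs)]
        # Maximum-likelihood estimate of the rate, expressed on the log scale.
        log_rate_hat = math.log(n_obs / sum(sample))
        # Asymptotic standard error of the log-rate estimate is 1 / sqrt(n_obs).
        se = 1.0 / math.sqrt(n_obs)
        lower = log_rate_hat - 1.96 * se
        upper = log_rate_hat + 1.96 * se
        covered += lower <= math.log(true_rate) <= upper
    return covered / n_reps


if __name__ == "__main__":
    for n_obs in (3, 30, 300):
        print(n_obs, wald_coverage(0.27, n_obs, 2000))
```

In a parametric bootstrap for a real data set, the known parameter value would be replaced by the fitted estimate and the toy simulator by the fitted outbreak model.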

Estimates for norovirus in health-care settings

We fitted our generalized linear model to data from a large prospective study of gastroenteritis in health-care settings (Lopman et al., 2004b). In this one-year study, patients and the care staff assigned to any of about 4500 beds in health-care facilities in the former County of Avon, England, were under active surveillance. Trained staff members recorded the dates over which people were symptomatic and took samples that allowed for laboratory confirmation of the causes of outbreaks. Fig. 2 shows the case histories that were used to fit our model.

Table 3

Simulation results for symptomatic period mean and dispersion parameter. n denotes the number of outbreaks simulated for an estimate. Missing refers to the fraction of case records deleted before fitting the data. Rounded indicates whether the onset and recovery times were rounded to whole days. Cover. refers to the coverage probability of Wald confidence intervals. Lower and upper refer to the bounds of a bootstrap confidence interval for the dispersion estimate. In the simulations, the mean was set to 3.32 days and the dispersion parameter was set to 0.58.

n    Missing  Rounded  Bias(mean)  Av.s.e.(mean)  Cover.(%)  Bias(dispersion)  Lower  Upper
1    0.00     0        −0.60       3.76           83         −0.067            0.01   1.46
1    0.00     1        −0.59       3.75           84         −0.037            0.00   2.00
1    0.53     0        −0.65       4.13           80         −0.096            0.01   1.47
1    0.53     1        −0.59       4.26           80         −0.027            0.00   2.00
10   0.00     0        −0.104      1.511          91         −0.005            0.31   0.98
10   0.00     1        −0.106      1.534          91         0.008             0.32   0.98
10   0.53     0        −0.108      2.123          90         −0.016            0.23   1.12
10   0.53     1        −0.106      2.161          90         0.002             0.24   1.15
100  0.00     0        −0.010      0.4703         94         −0.0006           0.49   0.68
100  0.00     1        −0.010      0.4772         95         0.0157            0.50   0.70
100  0.53     0        −0.007      0.6853         94         −0.0012           0.46   0.73
100  0.53     1        −0.010      0.6928         94         0.0132            0.47   0.76

Fig. 1. Estimates versus number of outbreaks. The row names indicate parameters. Each small black point represents an estimate. The larger gray points represent the means of the estimates. The horizontal lines represent the values of the parameters used to simulate the data.

The predictors in our model were facility type, which indicated whether an outbreak took place in a long-term-care facility (LTCF) or a hospital; size class, which indicated the number of patients and staff in the unit; period, which indicated the time of the study year when the outbreak began; and case type, which indicated whether a case was a patient or a member of the care staff. See the Methods section for more details.

For our baseline regression group of patients in large hospitals in the fall and winter, the estimate (95% bootstrap confidence interval) of the transmission rate was 0.0037 (0.0026–0.0052) transmissions per infective-susceptible day, that of the initial growth rate was 0.27 (0.23–0.30) transmissions per infective day, that of the symptomatic period was 3.35 (3.09–3.57) days, and that of the dispersion parameter for the symptomatic period was 0.57 (0.54–0.65). Those parameter estimates have been transformed from the log scale for ease of interpretation. The full set of untransformed estimates is given in Table 4.
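The back-transformation from the log scale amounts to exponentiating the intercept and its interval bounds. The short Python sketch below illustrates this with the intercept row of Table 4; the dictionary and function names are hypothetical, and because the inputs are the rounded table values, the results agree with the estimates quoted above only up to rounding.

```python
import math

# Intercepts and 95% bootstrap bounds on the natural-log scale, copied from
# the (Intercept) row of Table 4: (estimate, lower, upper).
log_scale = {
    "transmission rate": (-5.60, -5.96, -5.25),
    "initial growth rate": (-1.31, -1.45, -1.20),
    "symptomatic period": (1.21, 1.13, 1.27),
}


def back_transform(estimate, lower, upper):
    """Map a log-scale estimate and its interval bounds to the natural scale."""
    return tuple(math.exp(v) for v in (estimate, lower, upper))


natural_scale = {name: back_transform(*vals) for name, vals in log_scale.items()}

if __name__ == "__main__":
    for name, (est, lo, hi) in natural_scale.items():
        print(f"{name}: {est:.4g} ({lo:.4g}-{hi:.4g})")
```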

The basic reproduction number R0 is the expected number of new infections that a single infection will cause at the beginning of an outbreak. Table 5 contains values of R0 calculated from the regression coefficients using the formula

$R_0 = (r_{\text{patient}} + r_{\text{staff}}) \times (\text{mean symptomatic period})_{\text{patient}}$.

Fig. 2. Case histories (x-axis: time in weeks; y-axis: case ID number). Each horizontal bar represents the history of a person. The symptomatic period is filled in. Case IDs were assigned by sorting the cases first by onset time, then by recovery time, and then by a random ordering. Initial infectives were given negative case IDs. The panels are arranged so that the outbreak size increases from top to bottom and the outbreak length increases from left to right. Case histories from long-term-care facilities (LTCFs) are in light gray. Some of the times of recovery from symptoms were imputed as described in the Methods section.

Fig. 3 shows estimates for the effect on model parameters of an outbreak being different from the outbreaks in the reference group with respect to one of our predictive variables. The largest effects are the increase in transmission and growth rates in long-term-care facilities (LTCFs) and the reduction in these rates in staff. It appears that transmission rates are higher in the smaller units. Symptomatic periods were estimated to be about 25% shorter for outbreaks in LTCFs and 20% shorter for cases among staff.
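Because the regressions are on the log scale, each coefficient corresponds to a multiplicative effect of exp(coefficient) on the parameter. The Python sketch below, whose function names are hypothetical, converts a few of the rounded coefficients from Table 4 into fold changes and percent changes to show how the effect sizes above arise.

```python
import math


def fold_change(log_coefficient):
    """Multiplicative effect implied by a coefficient on the natural-log scale."""
    return math.exp(log_coefficient)


def percent_change(log_coefficient):
    """Percent change in a parameter implied by a log-scale coefficient."""
    return 100.0 * (fold_change(log_coefficient) - 1.0)


# Rounded point estimates taken from Table 4.
ltcf_transmission_fold = fold_change(1.98)    # LTCF effect on transmission rate
ltcf_growth_fold = fold_change(1.06)          # LTCF effect on initial growth rate
ltcf_period_change = percent_change(-0.28)    # LTCF effect on symptomatic period
staff_period_change = percent_change(-0.23)   # Staff effect on symptomatic period

if __name__ == "__main__":
    print(f"LTCF transmission rate: {ltcf_transmission_fold:.2f}-fold")
    print(f"LTCF initial growth rate: {ltcf_growth_fold:.2f}-fold")
    print(f"LTCF symptomatic period: {ltcf_period_change:.1f}%")
    print(f"Staff symptomatic period: {staff_period_change:.1f}%")
```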

Diagnostics

As a general test of model fit for the transmission rate and growth rate likelihoods, we calculated the percentile of the log likelihood of the fit to the real data in the distribution of log likelihoods generated by bootstrapping. Out of 10,000 bootstrap replicates, our optimization code found estimates in 9809 cases. The log likelihood of the fit to the real data was in the 25th percentile of the log likelihoods from these estimates. Thus, the log likelihood of our fit to the real data is not extreme, consistent with a good model fit.
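The percentile calculation itself is simple once the bootstrap log likelihoods are in hand. The Python sketch below is a minimal illustration with a hypothetical function name; the convention of counting values less than or equal to the observed one is an assumption, since the exact definition used for the percentile is not stated here.

```python
def percentile_rank(observed, reference_values):
    """Percent of reference values that are less than or equal to observed."""
    if not reference_values:
        raise ValueError("reference_values must not be empty")
    n_at_or_below = sum(1 for value in reference_values if value <= observed)
    return 100.0 * n_at_or_below / len(reference_values)


if __name__ == "__main__":
    # Made-up log likelihoods, for demonstration only.
    bootstrap_log_likelihoods = [-1650.0, -1640.0, -1630.0, -1620.0]
    print(percentile_rank(-1645.0, bootstrap_log_likelihoods))
```

A percentile near 0 or 100 would flag the fit to the real data as atypical of data simulated from the fitted model.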

Our use of the moments estimator for the dispersion parameter in (1) precluded a similar assessment of model fit for the symptomatic periods. However, inspection of the default diagnostic plots for glm objects in R did not indicate any problems.

Table 4

Regression coefficients (95% bootstrap confidence interval) for the natural log of the outbreak-model parameters.

                Transmission rate       Initial growth rate     Symptomatic period
(Intercept)     −5.60 (−5.96, −5.25)    −1.31 (−1.45, −1.20)    1.21 (1.13, 1.27)
LTCF            1.98 (1.61, 2.35)       1.06 (0.87, 1.20)       −0.28 (−0.37, −0.18)
Small           0.45 (0.34, 0.70)       −0.03 (−0.15, 0.11)     0.08 (0.00, 0.16)
Spring–summer   −0.06 (−0.29, 0.14)     0.04 (−0.09, 0.17)      0.00 (−0.08, 0.09)
Staff           −1.03 (−1.43, −0.74)    −0.87 (−1.00, 0.73)     −0.23 (−0.31, −0.14)
Dispersion                                                      0.57 (0.54, 0.65)

Num. obs.: 1523, 1412
Log likelihood: −1625, −2896

Table 5

Predicted R0 values (95% bootstrap confidence interval) based on regression coefficients.

Facility   Season          Size class   R0
LTCF       Fall–winter     Small        2.78 (2.55, 3.23)
LTCF       Fall–winter     Large        2.62 (2.40, 2.98)
LTCF       Spring–summer   Small        2.90 (2.69, 3.29)
LTCF       Spring–summer   Large        2.74 (2.53, 3.04)
Hospital   Fall–winter     Small        1.27 (1.22, 1.43)
Hospital   Fall–winter     Large        1.20 (1.14, 1.33)
Hospital   Spring–summer   Small        1.33 (1.26, 1.53)
Hospital   Spring–summer   Large        1.25 (1.19, 1.41)

The estimated value of the log of the initial numbers of susceptibles (calculated as $\log \hat{X}(0)_i = \log(\hat{r}_i / \hat{\beta}_i)$) provided a means for a sanity check of our estimates. As described in the Methods section, our size-class predictive variable was determined from the number of beds or staff assigned to a unit. We chose not to directly use the numbers of beds or staff as the initial number of susceptibles in our model because they are likely noisy measurements of the true value. However, it is worth noticing in Fig. 4 that our estimates are on approximately the same scale as the numbers of beds and staff. Furthermore, our estimates replicate two qualitative differences seen in the bed-number and staff-size data. First, units in the small size class do indeed have smaller $\log \hat{X}(0)_i$ than do units in the large size class. Second, $\log \hat{X}(0)_i$ for staff are larger than those for patients. On the other hand, $\log \hat{X}(0)_i$ is larger for hospitals than for LTCFs while LTCFs have more beds and larger staffs. This difference might be reduced by increasing the complexity of our model, but such an exercise in model selection is beyond the scope of this paper.
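The direction of each of these comparisons can be read off the regression coefficients, because the log of the initial number of susceptibles is the difference between the log growth rate and the log transmission rate. The Python sketch below uses the rounded point estimates from Table 4; the dictionary layout and function name are hypothetical, and the values it produces are approximations that ignore the uncertainty shown in Fig. 4.

```python
import math

# Rounded point estimates on the natural-log scale from Table 4:
# (transmission-rate coefficient, initial-growth-rate coefficient).
COEF = {
    "intercept": (-5.60, -1.31),
    "ltcf": (1.98, 1.06),
    "small": (0.45, -0.03),
    "spring_summer": (-0.06, 0.04),
    "staff": (-1.03, -0.87),
}


def log_initial_susceptibles(*levels):
    """log X(0) = log r - log beta for the reference group plus the given levels."""
    terms = ["intercept", *levels]
    log_beta = sum(COEF[term][0] for term in terms)
    log_r = sum(COEF[term][1] for term in terms)
    return log_r - log_beta


if __name__ == "__main__":
    for levels in [(), ("small",), ("staff",), ("ltcf",)]:
        label = "+".join(levels) or "reference"
        print(label, round(math.exp(log_initial_susceptibles(*levels)), 1))
```

For the reference group of patients in large hospitals in the fall and winter, the difference of the intercepts is 4.29 on the log scale, or roughly 73 initial susceptibles.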

Fig. 3. Regression effect estimates (columns: LTCF, Small, Spring–summer, Staff; rows: transmission rate, initial growth rate, symptomatic period; axes: count and multiplicative effect). The column names indicate levels of the categorical predictive variables by which outbreaks differed from the reference group. The reference group was made up of all cases of norovirus among patients in large hospitals in the fall and winter. The row names indicate parameters. The histograms display the distributions of estimates obtained in a parametric bootstrap. Gray rectangles indicate a 95% confidence interval based on the percentiles of bootstrap estimates. The white horizontal line inside each rectangle indicates the ML estimate. LTCF stands for long-term-care facility.

Fig. 4. Estimated initial susceptibles and measures of care-unit size (columns: Small, Large; rows: LTCF, Hospital; x-axis: case type, Patient or Staff; y-axis: initial susceptibles). The column names, row names, and x-axis marks indicate the levels of the predictive variables that determined the estimate of initial susceptibles represented by the gray bars. The middle 95% of estimates from the parametric bootstrap were used to determine the upper and lower bounds of the gray bars. Each point corresponds to an observed outbreak of NoV and indicates either the number of beds for patients in the care unit or the number of staff assigned to a unit. LTCF stands for long-term-care facility. Estimates of initial susceptibles did not vary much by the period predictive variable, so estimates and data from both periods are grouped together in this plot.

To graphically evaluate the fit of our model, we plotted observed values of cumulative exposure before infection, symptomatic period duration, and cumulative incidence against those expected from the fitted model (Figs. 5, 6, and 7).

Fig. 5. Quantile–quantile plots of the cumulative exposure of people at the time of infection. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching distributions would fall.

Fig. 6. Quantile–quantile plots of the symptomatic periods. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching distributions would fall.

Fig. 7. Observed versus expected cumulative case counts. The semi-transparent lines connect data points from the same outbreak. The opaque diagonal line is a reference line that indicates where points from perfectly matching functions would fall.

From Fig. 5, it seems that the model fits best for intermediate values of cumulative exposure in the range of 50–200 infective-person days. In this range, the lines connecting data points from individual outbreaks are roughly parallel to and centered about the reference line, which shows that in this range the observed distributions are similar in shape to the expected distributions but with probability densities shifted to the left or right. For low values of cumulative exposure, the observed distributions have much higher densities than the fitted model would predict. This disagreement is visible as the high density of points below the reference line. For high values of cumulative exposure, the observed distributions seem to be truncated at around 220 infective-person days, whereas the fitted model assumes an exponential distribution that predicts much larger values.

In Fig. 6, observed symptomatic periods are generally close to predicted symptomatic periods. However, there seems to be a tendency for symptomatic periods in some outbreaks to be less varied than the model predicts. This tendency is visible as lines that start above the reference line on the left side of the plot and then bend to the right quickly enough to be under the reference line by the time they end.

In Fig. 7, cumulative incidence is higher than expected early in outbreaks as a consequence of the larger-than-predicted number of cases with small exposure (Fig. 5). The difference between observed and expected cumulative incidence tends to become less positive or slightly negative by the end of outbreaks, and the absolute difference is usually less than 10 throughout.

Discussion

We have shown that reliable estimation of parameters from many small outbreaks is possible using a generalized linear model based on standard stochastic epidemic models. A simulation study demonstrated that we are able to accurately estimate parameters when the data stem from small outbreaks even when some data are missing and about half of recovery times are imputed. Fitting the model to a large number of outbreaks of norovirus, we found that facility type, facility size, and case type seem to have significant effects on outbreak dynamics.

Höhle (2009) described a highly general formulation of stochastic epidemic models within a regression framework, and our model is almost a special case of that general formulation. However, that formulation did not include regressions for the mean infectious period or the initial growth rate, which we include here. Modeling details aside, our work here differs from Höhle (2009) and related regression approaches (Forrester and Pettitt, 2005; Voirin et al., 2011; Meyer et al., 2012) in demonstrating the particular value of a multiple-outbreak regression when fitting data from small outbreaks, which are quite common in health-care settings.

The most striking results of our regression estimates (Fig. 3) are the approximately 7-fold increase in transmission rates and 3-fold increase in initial growth rates in the long-term-care facilities (LTCFs) relative to hospitals. Fig. 2 shows that LTCF outbreaks do indeed include many of the larger and faster growing outbreaks in the data set.

The higher transmission rates for occupants of LTCFs may be a consequence of occupants having more opportunity to socialize in large groups. Alternatively, we may be seeing the effects of our assumptions of a closed population and homogeneous mixing being violated. Hospitals have more rapid turnover of patients, and the exposure of people who arrived in the care unit after the outbreak started will be overestimated in our model. Occupants of LTCFs may vary more in contact rates by virtue of behavioral differences, and such variation in exposure could lead to a higher initial growth rate (Becker, 1989, pp. 133–138).

Our model is agnostic about the particular pathways of transmission in the outbreaks, and thus the estimates represent rates of transmission by all routes including transmission by person-to-person, environmentally-mediated, and foodborne routes. Outbreak investigations often associate foodborne transmission with an abrupt increase in infections (e.g., Isakbaeva et al., 2005), and thus differences in the frequency of this mode of transmission between settings could cause differences in our estimates. That being said, we did not include infection terms corresponding to foodborne transmission because most reported norovirus outbreaks are attributed primarily to person-to-person transmission (Yen et al., 2011), and we believed that all the outbreaks we analyzed involved primarily a combination of person-to-person and environmentally-mediated transmission. In crowded settings such as health-care facilities, it is often difficult to determine the relative frequency of these routes of transmission (Kuusi et al., 2002; Isakbaeva et al., 2005; Centers for Disease Control and Prevention, 2008).

Although our aim was not to fit the data to a highly realistic model, we did take a significant step towards realism by modeling variation in the initial growth rate of the outbreak. The transmission rate in our model determines how the expected number of new cases increases as the product of the numbers of infectives and susceptibles increases. Thus estimates of the transmission rate will be highly sensitive to those of the initial number of susceptibles, which determines the number of susceptibles throughout the outbreak. But in the case of norovirus, the number of susceptibles is difficult to know as there is no serological correlate of protection. Recent work on joint estimation of transmission rates and the initial number of susceptibles with data from a single outbreak (Hayakawa et al., 2003; Huggins et al., 2004; Lau and Yip, 2008; Kypraios, 2009) has shown that estimates of the initial number of susceptibles tend to be low when data sets are small. As we have seen in our simulation results (Fig. 1), this bias decreases as the number of outbreaks in the data set increases, even if all outbreaks are small.

The estimates for NoV transmission dynamics we calculated complement results from previous epidemiological analyses of NoV in health-care settings. Previous analyses of our data set (Lopman et al., 2004a,b) had examined how risk of NoV infection or particular symptoms of NoV infection varied with age and other personal characteristics. The current analysis adds to these results by quantifying effects that could be used to predict norovirus outbreak dynamics.

Analysis of a 2003–2006 study of NoV outbreaks in long-term-care facilities (LTCFs) in Oregon (Rosenthal et al., 2011) suggested that larger facilities may have a higher risk of experiencing outbreaks. Our result that transmission rates are lower in larger facilities suggests that any increased risk that larger facilities have is not caused by increased transmission rates. However, our result must be interpreted with caution because we have not been able to account for many factors that may affect contact rates, such as number of beds per room. Whether the cause of the lower transmission rates is really facility size or a correlation of facility size with some omitted variable is unclear.

A few previous studies have estimated individual-level parameters for NoV that are comparable to our estimates. Using data from a NoV outbreak in a primary school and nursery in Derbyshire, England, O’Neill and Marks (2005) estimated that the probability of a susceptible person avoiding infection from an infective person in the school for a day was 0.998. Using the formula Pr(avoidance) = exp(−β × 1 susceptible × 1 infective × 1 day), our estimates yield Pr(avoidance) that ranges from about 0.959 for patients in small LTCFs to 0.999 for staff in large hospitals.
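The two ends of that range can be reproduced from the regression coefficients. The Python sketch below, with a hypothetical function name, sums the relevant rounded log-scale coefficients from Table 4 to obtain β for each group and then applies the avoidance formula for one susceptible, one infective, and one day.

```python
import math


def daily_avoidance_probability(log_beta):
    """exp(-beta * 1 susceptible * 1 infective * 1 day), with beta given on the log scale."""
    beta = math.exp(log_beta)
    return math.exp(-beta)


# Transmission-rate coefficients from Table 4 (rounded point estimates).
INTERCEPT, LTCF, SMALL, STAFF = -5.60, 1.98, 0.45, -1.03

# Patients in small LTCFs: intercept plus the LTCF and Small effects.
small_ltcf_patient = daily_avoidance_probability(INTERCEPT + LTCF + SMALL)
# Staff in large hospitals: intercept plus the Staff effect.
large_hospital_staff = daily_avoidance_probability(INTERCEPT + STAFF)

if __name__ == "__main__":
    print(f"patients in small LTCFs: {small_ltcf_patient:.3f}")
    print(f"staff in large hospitals: {large_hospital_staff:.3f}")
```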

Heijne et al. (2009) estimated the basic reproduction number of NoV in boy-scout camps to be about 14 and 7, respectively, under two different sets of assumptions. Our highest R0 was approximately equal to 3 (Table 5). The relative lowness of our R0s might reflect contact rates being higher in the camp setting, and it may also reflect the effect of better hygiene in the health-care settings. Heijne et al. (2009) estimated that the implementation of an enhanced hygiene protocol drove the reproduction number in the camps down to about 2 and 1, values on par with our own estimates. Likewise, Heijne et al. (2012) estimated reproduction numbers for an outbreak within the wards of a psychiatric institution to be close to one and thus close to our hospital R0s. Actually, those estimates tend to be lower than ours, but they are reproduction numbers averaged over the full course of an outbreak and thus should be expected to be lower due to the depletion of susceptibles over time. Our estimates may be more generalizable than both the psychiatric-ward and boy-scout estimates because our data set was larger and included data from both large and small outbreaks.

Zelner et al. (2010) used data from a Stockholm outbreak to estimate that the average infectious period was 1.2 days. The setting of these outbreaks was households that included children in daycare centers. Thus, the infectious period may have been shorter in these outbreaks because many of the infectives were likely healthy people between the ages of 5 and 70, whereas people below the age of 5 and, to an even greater degree, people over the age of 70 were over-represented in our data (Lopman et al., 2004b). In our data, people in these extreme age groups had average symptomatic periods of 3 days (Lopman et al., 2004b). Moreover, the Stockholm estimate is based on imputed infectious periods rather than symptomatic periods, which were not reported. As a result, if the assumed initial number of susceptibles for the Stockholm analysis was too high, the infectious period would have been underestimated.

Although our estimates of the symptomatic period may be relatively long, it is possible that some of the patients were discharged into the community before they became asymptomatic. Thus, for patients, our estimates most accurately describe the period of being symptomatic while simultaneously being in a health-care facility.

The daily transmission rates estimated from the Stockholm data, 0.14 transmissions per infective-susceptible day, are more than 3-fold higher than our highest estimated transmission rate, which was 0.04 transmissions per infective-susceptible day for patients in small LTCFs. The joint estimation approach we used could be applied to the Stockholm data to determine whether the higher transmission-rate estimates may have resulted from underestimation of household sizes.

However, the transmission rates may well be different because of differences in hygiene measures, contact rates, or different levels of baseline immunity. Nurses for example may be frequently exposed and therefore highly immune. Additionally, time-series analysis of outbreak incidence (Lopman et al., 2009) has suggested that transmission rates generally may vary with host, weather, and virus factors. Taken together, these differences may explain the large discrepancy in estimated transmission rates. The compilation and analysis of a large, multiple-outbreak data set that includes predictive variables indicative of hygiene, contact rates, and baseline immunity could shed light on which of these elements has the greatest effect on transmission rates.

In our application, we made the simplifying assumption that the latent period was fixed at its mean, which allowed us to directly calculate infection times from the reported onset of symptoms. The infection times determine the cumulative exposures $h_{i,j}$ in (3). Because the cumulative exposure is a non-linear function of time and the mean of a non-linear function of a random variable does not always equal the function evaluated at the random variable’s mean, the extent to which latent periods varied in reality likely introduced bias into our calculated cumulative exposures and the estimates based on them. The bias could be either positive or negative depending on whether cumulative exposure usually increases more quickly before or after the assumed transmission times.

Another simplifying assumption we made was that people were only infectious when they were symptomatic. In support of this assumption, Sukhrie et al. (2012) have shown that asymptomatic people are much less infectious than symptomatic people in health-care settings. To quantify the effect of this and the fixed latent period assumption, we could make the infectious period a latent variable that we integrate over to evaluate the likelihood, as in Hohle et al. (2005). Alternatively, we could use a kernel-smoothing method to estimate unobserved latent or infectious periods, as in Lau and Yip (2008). However, even without such calculations it is clear that if, in reality, the infectious period extends beyond the symptomatic period, our estimates of transmission rates have been inflated by our underestimation of exposure.

From the numerical results displayed in Tables 2 and 3, we see that highly reliable estimation depends on collection of an extensive data set. The Centers for Disease Control and Prevention (CDC) has recently established a National Outbreak Reporting System that, with the contributions of state health departments, will provide more comprehensive surveillance for all U.S. gastroenteritis outbreaks (Centers for Disease Control and Prevention, 2011). However, the data we have analyzed here are more detailed than what is routinely collected in outbreak investigations. More detailed outbreak investigations are needed to collect such data and further characterize modes of transmission. The collection of NoV genomic data may also be of great value (Teunis et al., 2013).

Regarding the general subject of outbreaks of hospital-acquired infections in individual hospitals, we suspect that in some hospitals a large part of the necessary data collection is already taking place as a part of existing surveillance programs. A 2008–2009 survey estimated that approximately one third of California hospitals used automated surveillance technology to monitor hospital-acquired infections (Halpin et al., 2011). Similarly, a survey of hospitals in the Northeastern United States found that one third of hospitals had an electronic surveillance system in place (Grota et al., 2010). Such systems were used to detect outbreaks, analyze data, and generate reports of hospital-acquired infections (Grota et al., 2010). As several states mandate reporting infection rates of MRSA and many more require some form of reporting of hospital-acquired infections (Committee to Reduce Infection Deaths, 2011), many hospitals may have data on the total number of cases in many outbreaks for several pathogens of concern.

Our analysis, based on a robust data set, demonstrates that parameter estimates are substantially less biased when a large number of outbreaks are fitted. We submit that, for norovirus and many other pathogens, there are several uses for accurate estimates of transmission rates, initial growth rates, and infectious periods. Policy-makers can use such estimates to compare the efficacy of different control strategies such as hygiene protocols, isolation measures, prophylactic treatments, and vaccination policies. Those monitoring the small outbreaks of zoonotic diseases may be able to use such estimates to identify variables that make transmission more likely.

Role of the funding source

This work was supported by NSF Grant EF-0742373 and NIH Grant R01GM088344. E.B.O. was also supported by BEACON Grant 61-2075UT. K.M.P. was also supported by the RAPIDD program of the Science and Technology Directorate, U.S. Department of Homeland Security, and the Fogarty International Center, NIH. Our funding sources had no role in this study’s design; in the collection, analysis, and interpretation of the analyzed data; in the writing of this paper; or in the decision to submit this paper for publication.

Acknowledgments

The Texas Advanced Computing Center at UT provided computing resources.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.epidem.2013.12.002.

References

Anderson, R.M., May, R.M., 1992. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, USA.

Becker, N., 1979. An estimation procedure for household disease data. Biometrika 66, 271–277.

Becker, N.G., 1989. Analysis of Infectious Disease Data. Chapman and Hall/CRC.

Becker, N.G., 1991. Analysis of infectious disease data from a sample of households. Lecture Notes-Monograph Series 18, 27–40.

Bolker, B., Skaug, H., 2011. R2admb: ADMB to R interface functions. R package version 0.7.5.1.

Cannon, J.L., Lindesmith, L.C., Donaldson, E.F., Saxe, L., Baric, R.S., Vinjé, J., 2009. Herd immunity to GII.4 noroviruses is supported by outbreak patient sera. Journal of Virology 83, 5363–5374.

Centers for Disease Control and Prevention, 2008. Norovirus outbreak in an elementary school - District of Columbia, February 2007. MMWR 56, 1340–1343.

Centers for Disease Control and Prevention, 2011. Updated norovirus outbreak management and disease prevention guidelines. MMWR 60, 1–18.

Committee to Reduce Infection Deaths, 2011. State laws on reporting HAIs. http://www.hospitalinfection.org/legislation.shtml

Ding, J.L., Chen, X.R., 2006a. Asymptotic properties of the maximum likelihood estimate in generalized linear models with stochastic regressors. Acta Mathematica Sinica, English Series 22, 1679–1686.

Ding, J.L., Chen, X.R., 2006b. Large-sample theory for generalized linear models with non-natural link and random variates. Acta Mathematicae Applicatae Sinica, English Series 22, 115–126.

Evans, M.R., Meldrum, R., Lane, W., Gardner, D., Ribeiro, C.D., Gallimore, C.I., Westmoreland, D., 2002. An outbreak of viral gastroenteritis following environmental contamination at a concert hall. Epidemiology and Infection 129, 355–360.

Fahrmeir, L., 1985. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. The Annals of Statistics 13, 342–368.

Forrester, M., Pettitt, A.N., 2005. Use of stochastic epidemic modeling to quantify transmission rates of colonization with Methicillin-Resistant Staphylococcus aureus in an intensive care unit. Infection Control and Hospital Epidemiology 26, 598–606.

Fournier, D.A., Skaug, H.J., Ancheta, J., Ianelli, J., Magnusson, A., Maunder, M.N., Nielsen, A., Sibert, J., 2011. AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software 27, 233–249.

Glass, R.I., Parashar, U.D., Estes, M.K., 2009. Norovirus gastroenteritis. New England Journal of Medicine 361, 1776–1785.

Grota, P.G., Stone, P.W., Jordan, S., Pogorzelska, M., Larson, E., 2010. Electronic surveillance systems in infection prevention: organizational support, program characteristics, and user satisfaction. American Journal of Infection Control 38, 509–514.

Halloran, M.E., Longini, I.M., Struchiner, C.J., 2009. Design and Analysis of Vaccine Studies. Springer.

Halpin, H., Shortell, S.M., Milstein, A., Vanneman, M., 2011. Hospital adoption of automated surveillance technology and the implementation of infection prevention and control programs. American Journal of Infection Control 39, 270–276.

Hayakawa, Y., O’Neill, P.D., Upton, D., Yip, P.S., 2003. Bayesian inference for a stochastic epidemic model with uncertain numbers of susceptibles of several types. Australian and New Zealand Journal of Statistics 45, 491–502.

Heijne, J.C.M., Rondy, M., Verhoef, L., Wallinga, J., Kretzschmar, M., Low, N., Koopmans, M., Teunis, P.F.M., 2012. Quantifying transmission of norovirus during an outbreak. Epidemiology 23, 277–284.

Heijne, J.C.M., Teunis, P., Morroy, G., Wijkmans, C., Oostveen, S., Duizer, E., Kretzschmar, M., Wallinga, J., 2009. Enhanced hygiene measures and norovirus transmission during an outbreak. Emerging Infectious Diseases 15, 24–30.

Höhle, M., 2009. Additive-multiplicative regression models for spatio-temporal epidemics. Biometrical Journal 51, 961–978.

Hohle, M., Jorgensen, E., O’Neill, P.D., 2005. Inference in disease transmission experiments by using stochastic epidemic models. Journal of the Royal Statistical Society: Series C (Applied Statistics) 54, 349–366.

Huggins, R.M., Yip, P.S.F., Lau, E.H.Y., 2004. A note on the estimation of the initial number of susceptible individuals in the general epidemic model. Statistics & Probability Letters 67, 321–330.

Isakbaeva, E.T., Widdowson, M.A., Beard, R.S., Bulens, S.N., Mullins, J., Monroe, S.S., Bresee, J., Sassano, P., Cramer, E.H., Glass, R.I., 2005. Norovirus transmission on cruise ship. Emerging Infectious Diseases 11, 154–158.

Kalbfleisch, J.D., Prentice, R.L., 2002. The Statistical Analysis of Failure Time Data, 2nd edition. Wiley.

Keeling, M.J., 2005. Models of foot-and-mouth disease. Proceedings. Biological Sciences 272, 1195–1202.

Kuusi, M., Nuorti, J.P., Maunula, L., Minh Tran, N.N., Ratia, M., Karlsson, J., von Bonsdorff, C.H., 2002. A prolonged outbreak of Norwalk-like calicivirus (NLV) gastroenteritis in a rehabilitation centre due to environmental contamination. Epidemiology and Infection 129, 133–138.

Kypraios, T., 2009. A note on maximum likelihood estimation of the initial number of susceptibles in the general stochastic epidemic model. Statistics & Probability Letters 79, 1972–1976.

Lau, E.H.Y., Yip, P.S.F., 2008. Estimating the basic reproductive number in the general epidemic model with an unknown initial number of susceptible individuals. Scandinavian Journal of Statistics 35, 650–663.

L’Ecuyer, P., Simard, R., Chen, E., Kelton, W., 2002. An object-oriented random-number package with many long streams and substreams. Operations Research 50, 1073–1075.

Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.

Lively, C.M., 2010. An epidemiological model of host–parasite coevolution and sex. Journal of Evolutionary Biology 23, 1490–1497.

Lopman, B., Armstrong, B., Atchison, C., Gray, J.J., 2009. Host, weather and virological factors drive norovirus epidemiology: time-series analysis of laboratory surveillance data in England and Wales. PLoS One 4, e6671.

Lopman, B.A., Hall, A.J., Curns, A.T., Parashar, U.D., 2011. Increasing rates of gastroenteritis hospital discharges in US adults and the contribution of norovirus, 1996–2007. Clinical Infectious Diseases 52, 466–474.

Lopman, B.A., Reacher, M.H., Vipond, I.B., Hill, D., Perry, C., Halladay, T., Brown, D.W., Edmunds, W.J., Sarangi, J., 2004a. Epidemiology and cost of nosocomial gastroenteritis, Avon, England, 2002–2003. Emerging Infectious Diseases 10, 1827–1834.

Lopman, B.A., Reacher, M.H., Vipond, I.B., Sarangi, J., Brown, D.W.G., 2004b. Clinical manifestation of norovirus gastroenteritis in healthcare settings. Clinical Infectious Diseases 39, 318–324.

Meyer, S., Elias, J., Höhle, M., 2012. A space-time conditional intensity model for invasive meningococcal disease occurrence. Biometrics 68, 607–616.

O’Neill, P.D., Marks, P.J., 2005. Bayesian model choice and infection route modelling in an outbreak of Norovirus. Statistics in Medicine 24, 2011–2024.

R Development Core Team, 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0.

Rida, W.N., 1991. Asymptotic properties of some estimators for the infection rate in the general stochastic epidemic model. Journal of the Royal Statistical Society. Series B (Methodological) 53, 269–283.

Rosenthal, N.A., Lee, L.E., Vermeulen, B.A.J., Hedberg, K., Keene, W.E., Widdowson, M., Cieslak, P.R., Vinjé, J., 2011. Epidemiological and genetic characteristics of norovirus outbreaks in long-term care facilities, 2003–2006. Epidemiology and Infection 139, 286–294.

Scallan, E., Hoekstra, R.M., Angulo, F.J., Tauxe, R.V., Widdowson, M., Roy, S.L., Jones, J.L., Griffin, P.M., 2011. Foodborne illness acquired in the United States—major pathogens. Emerging Infectious Diseases 17, 7–15.

Shao, Q.X., 1999. Some properties of an estimator for the basic reproduction number of the general epidemic model. Mathematical Biosciences 159, 79–96.

Sukhrie, F.H.A., Teunis, P., Vennema, H., Copra, C., Thijs Beersma, M.F.C., Bogerman, J., Koopmans, M., 2012. Nosocomial transmission of norovirus is mainly caused by symptomatic cases. Clinical Infectious Diseases 54, 931–937.

Teunis, P., Heijne, J.C.M., Sukhrie, F., van Eijkeren, J., Koopmans, M., Kretzschmar, M., 2013. Infectious disease transmission as a forensic problem: who infected whom? Journal of the Royal Society Interface 10, 20120955.

Thornley, C.N., Emslie, N.A., Sprott, T.W., Greening, G.E., Rapana, J.P., 2011. Recurring norovirus transmission on an airplane. Clinical Infectious Diseases 53, 515–520.

Vignaux, T., Muller, K., Helmbold, B., 2012. SimPy Manual. Available at http://simpy.sourceforge.net

Venables, W.N., Ripley, B.D., 2002. Statistics Complements to Modern Applied Statistics with S, 4th edition. Available at http://www.stats.ox.ac.uk/pub/MASS4/

Voirin, N., Roche, S., Vanhems, P., Giard, M., David-Tchouda, S., Barret, B., Ecochard, R., 2011. A multiplicative hazard regression model to assess the risk of disease transmission at hospital during community epidemics. BMC Medical Research Methodology 11, 53.

Wickham, H., 2009. ggplot2: Elegant Graphics for Data Analysis. Springer, New York.

Wikswo, M.E., Cortes, J., Hall, A.J., Vaughan, G., Howard, C., Gregoricus, N., Cramer, E.H., 2011. Disease transmission and passenger behaviors during a high morbidity Norovirus outbreak on a cruise ship, January 2009. Clinical Infectious Diseases 52, 1116–1122.

Wilkinson, G.N., Rogers, C.E., 1973. Symbolic description of factorial models for analysis of variance. Applied Statistics 22, 392–399.

Yen, C., Wikswo, M.E., Lopman, B.A., Vinje, J., Parashar, U.D., Hall, A.J., 2011. Impact of an emergent norovirus variant in 2009 on norovirus outbreak activity in the United States. Clinical Infectious Diseases 53, 568–571.

Zelner, J.L., King, A.A., Moe, C.L., Eisenberg, J.N.S., 2010. How infections propagate after point-source outbreaks: an analysis of secondary norovirus transmission. Epidemiology 21, 711–718.

Zelner, J.L., Lopman, B.A., Hall, A.J., Ballesteros, S., Grenfell, B.T., 2013. Linking time-varying symptomatology and intensity of infectiousness to patterns of norovirus

Figure

Fig. 1. Estimates versus number of outbreaks. The row names indicate parameters.
Fig. 5. Quantile–quantile plots of the cumulative exposure of people at the time of infection
Fig. 6. Quantile–quantile plots of the symptomatic periods. The semi-transparent lines connect data points from the same outbreak
