Contents lists available at ScienceDirect
Epidemics
journal homepage: www.elsevier.com/locate/epidemics
Using phenomenological models for forecasting the 2015 Ebola challenge
Bruce Pell a,d,∗, Yang Kuang a, Cecile Viboud c, Gerardo Chowell b,c
a School of Mathematical and Statistical Sciences, Arizona State University, AZ, USA
b School of Public Health, Georgia State University, Atlanta, GA, USA
c Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, MD, USA
d Department of Mathematics, Statistics, and Computer Science, St. Olaf College, MN, USA
Article info
Article history:
Received 30 June 2016
Received in revised form 1 November 2016
Accepted 15 November 2016
Available online xxx
Keywords:
Logistic growth model
Richards model
Generalized Richards model
Ebola challenge
Abstract
Background: The rising number of novel pathogens threatening the human population has motivated the application of mathematical modeling for forecasting the trajectory and size of epidemics.
Materials and methods: We summarize the real-time forecasting results of the logistic equation during the 2015 Ebola challenge focused on predicting synthetic data derived from a detailed individual-based model of Ebola transmission dynamics and control. We also carry out a post-challenge comparison of two simple phenomenological models. In particular, we systematically compare the logistic growth model and a recently introduced generalized Richards model (GRM) that captures a range of early epidemic growth profiles ranging from sub-exponential to exponential growth. Specifically, we assess the performance of each model in estimating the reproduction number, generating short-term forecasts of the epidemic trajectory, and predicting the final epidemic size.
Results: During the challenge the logistic equation consistently underestimated the final epidemic size, peak timing and the number of cases at peak timing with an average mean absolute percentage error (MAPE) of 0.49, 0.36 and 0.40, respectively. Post-challenge, the GRM, which has the flexibility to reproduce a range of epidemic growth profiles ranging from early sub-exponential to exponential growth dynamics, outperformed the logistic growth model in ascertaining the final epidemic size as more incidence data was made available, while the logistic model underestimated the final epidemic size even with an increasing amount of data of the evolving epidemic. Incidence forecasts provided by the generalized Richards model performed better across all scenarios and time points than the logistic growth model, with mean RMS decreasing from 78.00 (logistic) to 60.80 (GRM). Both models provided reasonable predictions of the effective reproduction number, but the GRM slightly outperformed the logistic growth model with a MAPE of 0.08 compared to 0.10, averaged across all scenarios and time points.
Conclusions: Our findings further support the consideration of transmission models that incorporate flexible early epidemic growth profiles in the forecasting toolkit. Such models are particularly useful for quickly evaluating a developing infectious disease outbreak using only case incidence time series of the early phase of the outbreak.
© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction
The rising number of novel pathogens with transmission potential threatening the human population has motivated the development of mathematical and computational modeling approaches for forecasting epidemic impact (Colizza et al., 2006; Balcan et al., 2009; Merler et al., 2015; Chretien et al., 2015). While epidemic models of disease spread have been used for decades primarily with the goal of gaining insight into the transmission dynamics and potential effect of different control strategies, researchers have only recently started to harness available computational power to simulate, calibrate, and generate forecasts of epidemic spread using a variety of epidemic models ranging from classic compartmental models to detailed agent-based models. Yet, besides significant increases in computational power, detailed epidemic data about the transmission characteristics and theoretical advances are needed in order to more realistically account for transmission and control mechanisms for different disease and social contexts.
∗ Corresponding author at: Arizona State University, Tempe, AZ, USA. E-mail addresses: [email protected], [email protected] (B. Pell).
http://dx.doi.org/10.1016/j.epidem.2016.11.002
1755-4365/© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.
Because epidemics associated with infectious diseases of rapid dissemination typically comprise only a few disease generations of transmission, epidemic assessment using forecasting models is crucial during the early epidemic growth phase in order to assess the potential disease burden posed by the infectious agent and approximate the scale of interventions needed to achieve epidemic containment. Unfortunately, the availability of detailed epidemiological data, particularly during the early stages of an evolving epidemic outbreak, is hindered by delays in detecting the first transmission events or releasing data to the public, or by the particular characteristics of the surveillance system. For instance, during the 2014–15 Ebola epidemic in West Africa, publicly available epidemiological data from the World Health Organization (WHO) was not available during the first weeks, during which the virus was gaining a solid foothold in populations of Guinea, Liberia and Sierra Leone. Moreover, data was largely limited to aggregated weekly Ebola case counts at the country level, which was the primary publicly available data set documenting the Ebola epidemic in West Africa. Case count data at the subnational level (e.g. county/district levels) that later became available revealed substantial spatial heterogeneity in transmission patterns across the affected areas in West Africa, which could have influenced epidemic forecasts and assessments of the transmission potential (Chowell et al., 2015).
In this article we summarize the forecasting results from using the logistic equation in the 2015 Ebola challenge. After summarizing these results, we present the results of a post-challenge systematic comparative analysis of the logistic growth model, which assumes an early exponential growth phase (Chowell and Viboud, 2016), and the generalized Richards model (GRM) (Chowell et al., 2016a), which incorporates a flexible range of early epidemic growth profiles including early sub-exponential and exponential growth epidemics. We compare the performance of these models in the context of the 2015 Ebola challenge based on synthetic data derived from a detailed individual-based model of Ebola transmission. Specifically, we analyze the reproduction number, forecasts of the epidemic trajectory and the final epidemic size. In addition to model comparison, we compare two methods for quantifying the uncertainty of the best-fit solutions to the synthetic data.
2. Materials and methods
2.1. Model description
The well-known logistic growth model was previously employed for forecasting the 2015 Ebola epidemic (Chowell et al., 2014), and was the model originally employed by the Arizona State Team (BP & YK) during the 2015 Ebola Challenge. This simple model is given by the following differential equation:
$$C'(t) = rC(t)\left(1 - \frac{C(t)}{K}\right) \tag{1}$$
where C(t) is the cumulative number of cases at week t, so that C'(t) models the rate at which new cases arise at week t. The logistic model relies on two parameters, the intrinsic infection rate, r, and the final epidemic size, K.
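As a minimal sketch of how such a growth equation can be turned into a cumulative case curve, the following Python code integrates the right-hand side $rC^{p}(1-(C/K)^{a})$ with a hand-written fourth-order Runge-Kutta scheme. With p = 1 and a = 1 this reduces to Eq. (1); other values of p and a correspond to the generalized Richards model described in the next paragraph. The function names, step size, and parameter values are illustrative assumptions and are not taken from the challenge submissions, which used MATLAB.

```python
import math


def growth_rate(c, r, K, p=1.0, a=1.0):
    """Right-hand side C'(t) = r * C^p * (1 - (C/K)^a); p = a = 1 is the logistic model."""
    return r * c ** p * (1.0 - (c / K) ** a)


def integrate_cumulative(c0, r, K, p=1.0, a=1.0, weeks=30, steps_per_week=100):
    """Integrate the growth equation with classical RK4.

    Returns the cumulative case counts at weeks 0, 1, ..., weeks.
    """
    h = 1.0 / steps_per_week
    c = float(c0)
    curve = [c]
    for _ in range(weeks):
        for _ in range(steps_per_week):
            k1 = growth_rate(c, r, K, p, a)
            k2 = growth_rate(c + 0.5 * h * k1, r, K, p, a)
            k3 = growth_rate(c + 0.5 * h * k2, r, K, p, a)
            k4 = growth_rate(c + h * k3, r, K, p, a)
            c += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        curve.append(c)
    return curve


def weekly_incidence(curve):
    """New cases per week, obtained as differences of the cumulative curve."""
    return [later - earlier for earlier, later in zip(curve, curve[1:])]


def logistic_closed_form(t, c0, r, K):
    """Exact solution of the logistic equation, used here only as a check."""
    return K / (1.0 + (K / c0 - 1.0) * math.exp(-r * t))


# Hypothetical parameter values chosen purely for illustration.
logistic_curve = integrate_cumulative(c0=5, r=0.5, K=1000)
sub_exponential_curve = integrate_cumulative(c0=5, r=0.5, K=1000, p=0.8)
```

Comparing `logistic_curve` with `sub_exponential_curve` over the first few weeks shows how a value of p below 1 slows early growth for the same r and K.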
For comparative purposes, we also analyzed the performance of the generalized Richards model (GRM) (Chowell et al., 2016a), which was recently devised to capture the possibility of early sub-exponential growth epidemics and is given by:
$$C'(t) = rC^{p}(t)\left[1 - \left(\frac{C(t)}{K}\right)^{a}\right] \tag{2}$$
The GRM enhances the Richards model (Wang et al., 2012) by integrating the generalized-growth model (GGM; $C'(t) = rC^{p}(t)$) (Viboud et al., 2016). Specifically, the GRM incorporates a deceleration of growth parameter p to model a range of early epidemic growth profiles ranging from constant incidence (p = 0) to polynomial (0 < p < 1) and exponential growth dynamics (p = 1). The GRM was recently employed to generate forecasts of the Zika epidemic in Antioquia, Colombia (Chowell et al., 2016a). All parameter values are positive: r is the growth rate, K is the final epidemic size, and a is a parameter that modulates the peak timing.
2.2. Data
The Research and Policy for Infectious Disease Dynamics (RAPIDD) Ebola Challenge was designed to test the forecasting ability of mathematical models during an epidemic in real-time (Ebola Challenge website, 2016). The challenge was motivated by the need to develop and test an ensemble of mathematical models for use in forecasting developing infectious disease epidemics and to foster collaborations across different scientific domains. Goals of the contest included:
1. Improving predictive capabilities for future emergencies
2. Guiding the implementation of control measures
3. Illustrating how data quality and availability affect prediction accuracy
In this spirit, synthetic epidemic data was generated by a modified version of the model published by Merler et al. that was calibrated for an EVD outbreak in Liberia (Merler et al., 2015). Synthetic epidemic data was released at five different time points, with a test release on Sept. 18, 2015. Model predictions were due two weeks after each time point. For model calibration, we only used the country level incidence time series data for predictions.
Each of the five batches of released data contained four scenarios representing different epidemiological conditions, behavioral changes, intervention measures and data availability, prepared for use in forecasting the epidemic (Chowell and Viboud, 2016). In addition, each scenario data set contained outbreak situation reports, transmission tree data and weekly reported new EVD cases at the county and country level. New EVD cases were forecasted at one, two, three and four weeks past each time point, see Fig. S1.
2.3. The generation time
The generation time is defined as the time elapsed between infection in an index case patient and infection in a patient infected by that index case (Chowell et al., 2006). We used transmission tree data (Ebola Challenge website, 2016) that was made available as part of the challenge for scenarios 1, 3 and 4 to derive their respective generation time distributions. For scenario 2 we used estimations from scenario 1.
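To illustrate the definition above, the sketch below computes generation times from a small transmission tree and summarizes them with a gamma distribution matched by the method of moments. The data layout (a dictionary of infection times and a list of infector/infectee pairs), the function names, and the method-of-moments fit are assumptions made for this example; the article does not state how the challenge data were formatted or how the gamma distributions were fitted.

```python
import statistics


def generation_times(infection_time, transmission_pairs):
    """Time elapsed between infection of each infector and infection of each infectee."""
    return [infection_time[infectee] - infection_time[infector]
            for infector, infectee in transmission_pairs]


def gamma_moments_fit(samples):
    """Match a gamma distribution to the sample mean and (sample) variance.

    Returns (mean, variance, shape, scale), with mean = shape * scale and
    variance = shape * scale**2.
    """
    mean = statistics.mean(samples)
    variance = statistics.variance(samples)
    shape = mean ** 2 / variance
    scale = variance / mean
    return mean, variance, shape, scale


# Hypothetical toy transmission tree: infection days and who infected whom.
toy_infection_time = {"A": 0, "B": 10, "C": 14, "D": 30}
toy_pairs = [("A", "B"), ("A", "C"), ("B", "D")]
toy_generation_times = generation_times(toy_infection_time, toy_pairs)
toy_fit = gamma_moments_fit(toy_generation_times)
```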
2.4. The effective reproduction number
The effective reproduction number, $R_e(t)$, is defined as the average number of new infections generated by one infectious individual in the population at time t (Nishiura and Chowell, 2009). $R_e(t)$ was numerically evaluated by training each model on an increasing amount of data (Chowell et al., 2016a,b) using the discretized renewal equation (Nishiura and Chowell, 2009; Chowell et al., 2016b; Fraser, 2007):
$$R_e(t_i) = \frac{I_i}{\sum_{j=0}^{i} I_{i-j}\,\rho_j} \tag{3}$$
where $I_i$ denotes incidence at time $t_i$, $\rho_j$ denotes the discretized generation time distribution, assumed to be gamma distributed with a mean of 16 days (Team WHOER, 2014), and the denominator represents the total number of cases that contribute (as primary cases) to generating new cases $I_i$ (as secondary cases) (Nishiura and Chowell, 2009). We estimate the effective reproduction number using the 2-step approach described in Chowell et al. (2016b). In step 1) we use nonlinear least squares to fit the phenomenological model to the synthetic data in order to estimate the model parameters. The initial number of cases $C_0$ is fixed according to the first observation in the data. Nominal 95% confidence intervals for parameter estimates are generated by simulating 200 best-fit curves C(t) using parametric bootstrap with a Poisson error structure, as in prior studies (Chowell et al., 2006). In step 2), we employ the uncertainty in model fit generated in step 1 and apply Eq. (3) to each of the curves comprising the ensemble of uncertainty time series data.
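The discretized renewal equation above can be sketched in a few lines of Python. The discretization of the gamma distribution shown here (probability mass over consecutive intervals of one reporting step, with zero weight at lag 0, then renormalized) is an assumption made for illustration; the article does not spell out its discretization, and the function names are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution, evaluated via logarithms for stability."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def discretized_generation_interval(mean, variance, step, n_lags, substeps=200):
    """Weights rho_0..rho_n_lags for a gamma generation time with given mean and variance.

    Assumption: rho_0 = 0 and rho_j is the gamma mass on ((j-1)*step, j*step],
    approximated with the midpoint rule and renormalized to sum to one.
    """
    shape = mean ** 2 / variance
    scale = variance / mean
    weights = [0.0]
    h = step / substeps
    for j in range(1, n_lags + 1):
        lower = (j - 1) * step
        mass = sum(gamma_pdf(lower + (m + 0.5) * h, shape, scale)
                   for m in range(substeps)) * h
        weights.append(mass)
    total = sum(weights)
    return [w / total for w in weights]


def effective_reproduction_number(incidence, rho):
    """Apply R_e(t_i) = I_i / sum_j I_{i-j} * rho_j to an incidence series.

    Entries with a zero denominator (e.g. the first time point when rho_0 = 0)
    are returned as NaN.
    """
    estimates = []
    for i in range(len(incidence)):
        max_lag = min(i, len(rho) - 1)
        denominator = sum(incidence[i - j] * rho[j] for j in range(max_lag + 1))
        estimates.append(incidence[i] / denominator if denominator > 0 else float("nan"))
    return estimates


# Weekly weights for a generation time with mean 16 days and variance 8 (step = 7 days).
weekly_rho = discretized_generation_interval(mean=16.0, variance=8.0, step=7.0, n_lags=6)
```

In the 2-step approach described above, `effective_reproduction_number` would be applied to each bootstrapped incidence curve in turn, yielding a distribution of $R_e$ estimates at each time point.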
2.5. Performance metrics and epidemiological forecasting targets
All teams that participated in the challenge had their models assessed according to a predefined set of performance metrics, which were used to systematically compare forecasting performance across the participating models. All metrics were calculated using model predicted incidences and observed incidences (synthetic incidence data). Performance metrics included: Pearson's correlation coefficient, mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and the mean absolute percentage error (MAPE). In addition to these, the coefficient of determination, $R^2$, was calculated using the formula $R^2 = 1 - \frac{SSR}{SST}$, where SSR is the sum of squared residuals and SST is the total sum of squares.
Incidence targets consisted of incidence predictions (new EVD cases) at 1, 2, 3 and 4 weeks after the last observed time point for a given scenario. The challenge assessed each team's model performance by comparing incidence targets and non-incidence targets using the metrics above. Non-incidence targets consisted of the effective reproduction number, peak time, incidence at peak time and final epidemic size.
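The performance metrics listed above are standard and can be written compactly. The following Python sketch uses their textbook definitions; the function names are illustrative, MAPE is returned as a fraction (matching the way values such as 0.49 are reported in this article), and it is an assumption that the challenge organizers used exactly these conventions.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson_r(predicted, observed):
    """Pearson's correlation coefficient (undefined if either series is constant)."""
    mp, mo = _mean(predicted), _mean(observed)
    covariance = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    spread_p = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    spread_o = math.sqrt(sum((o - mo) ** 2 for o in observed))
    return covariance / (spread_p * spread_o)


def mse(predicted, observed):
    """Mean square error."""
    return _mean([(p - o) ** 2 for p, o in zip(predicted, observed)])


def rmse(predicted, observed):
    """Root mean square error."""
    return math.sqrt(mse(predicted, observed))


def mae(predicted, observed):
    """Mean absolute error."""
    return _mean([abs(p - o) for p, o in zip(predicted, observed)])


def mape(predicted, observed):
    """Mean absolute percentage error as a fraction; observed values must be nonzero."""
    return _mean([abs(p - o) / abs(o) for p, o in zip(predicted, observed)])


def r_squared(predicted, observed):
    """Coefficient of determination, R^2 = 1 - SSR / SST."""
    mo = _mean(observed)
    ssr = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    sst = sum((o - mo) ** 2 for o in observed)
    return 1.0 - ssr / sst


# Hypothetical 4-week incidence forecast versus observed incidence.
example_predicted = [10, 20, 30, 25]
example_observed = [12, 18, 33, 20]
example_rmse = rmse(example_predicted, example_observed)
```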
2.6. Uncertainty method 1
During the Ebola challenge, incidence targets, effective reproduction number, final epidemic size and peak timing predictions were generated by employing MATLAB's (The Mathworks, Inc.) built-in function, LSQCURVEFIT, with the Levenberg-Marquardt option to find optimized parameter values for the best fit solution of the logistic model to the cumulative reported EVD cases (Moré, 1978; Marquardt, 1963).
During the challenge we consistently employed a residual bootstrapping method, described in Pell et al. (2016), to obtain the 25th and 75th percentiles for parameter estimates. In short, we fit the model once and randomly added the residuals back into the original incidence data to create a new data set. A new optimized parameter set was then obtained by fitting the logistic model to this new data set, and the process was repeated 2000 times.
2.7. Uncertainty method 2
For model comparison, Eqs. (1) and (2) were fitted to the reported incidence data using the built-in MATLAB function LSQCURVEFIT (The Mathworks, Inc.). With this method, confidence intervals for model parameters and epidemiological forecasting targets were constructed as in prior studies (Chowell et al., 2016a, 2006) by simulating 200 realizations of the best-fit curve using parametric bootstrap with a Poisson error structure. The 95% confidence intervals were calculated by taking the 2.5 and 97.5 percentiles from the generated parameter distributions.
Incidence forecast estimations were generated by extending the 200 realizations of the best-fit trajectory of a model 4 weeks into the future after the forecasting time point. The 95% confidence bands for the incidence targets were constructed from the distributions of incidence predictions at each time point.
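The parametric bootstrap with a Poisson error structure described above can be sketched as follows. This is a simplified illustration in pure Python rather than the MATLAB workflow used in the study: the model-fitting step is abstracted into a user-supplied `refit` callable (a hypothetical name), the Poisson sampler is a basic chunked version of Knuth's method, and the percentile rule uses linear interpolation, which may differ from the convention the authors used.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) using Knuth's method in chunks to avoid underflow."""
    total = 0
    remaining = float(lam)
    while remaining > 0:
        chunk = min(remaining, 30.0)
        limit = math.exp(-chunk)
        count, product = 0, rng.random()
        while product > limit:
            count += 1
            product *= rng.random()
        total += count
        remaining -= chunk
    return total


def parametric_bootstrap(best_fit_incidence, refit, n_boot=200, seed=0):
    """Simulate n_boot Poisson realizations of the best-fit curve and refit each one.

    `refit` maps a synthetic incidence series to an estimate (a parameter,
    a forecast target, ...); here it stands in for the nonlinear least-squares fit.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        synthetic = [poisson_draw(mu, rng) for mu in best_fit_incidence]
        estimates.append(refit(synthetic))
    return estimates


def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower, upper = math.floor(position), math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


# Hypothetical best-fit weekly incidence; total case count used as a stand-in estimate.
best_fit = [5, 20, 60, 120, 60, 20, 5]
boot = parametric_bootstrap(best_fit, refit=sum, n_boot=200, seed=1)
ci_95 = (percentile(boot, 2.5), percentile(boot, 97.5))
```

In a full analysis, `refit` would refit Eq. (1) or Eq. (2) to each synthetic series and return the quantity of interest, so that the 2.5 and 97.5 percentiles of the resulting distribution give the nominal 95% confidence interval.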
3. Results
3.1. Challenge results
During the challenge the logistic equation coupled with Uncertainty Method 1 consistently underestimated the final epidemic size, peak timing and the number of cases at peak timing with an average MAPE of 0.49, 0.36 and 0.40 respectively. Fig. S3 of the Supplemental material illustrates the underestimations of these key quantities across all scenarios and time points. Estimations of the effective reproduction number showed similar behavior with an average MAPE across all scenarios of 0.22 (Fig. S4; Supplemental material). In contrast, averages of Pearson's R were 0.72, 0.58, 0.55 and −0.24 for scenarios 1, 2, 3, and 4 respectively. This indicates that model incidence forecasting trajectories for scenarios 1–3 approximately followed the same trend as the data. Although three of these averages of Pearson's R are positive, there is room for improvement so that they are closer to 1 (a Pearson's R score closer to 1 means better agreement with the trend of the incidence data). Fig. S5 of the Supplemental material presents epidemic forecasting plots that were generated during the challenge. In this figure, a comparison between fitted model trajectories to cumulative cases is presented side by side with the resulting incidence predictions. We would like to point out the consistent underestimation of the new cases and cumulative cases.
Because of a misunderstanding during the challenge, estimations were submitted for the basic reproduction number instead of the effective reproduction number. Consequently, this led to incorrect predictions of the effective reproduction number during the challenge. Here we provide corrected results using Eq. (3) (Fig. S4, left column) and summarize the rest of the predictions made by the logistic model during the challenge in Table S1 and Fig. S3 (left column).
3.2. Motivation for post-challenge model iteration
The results discussed above suggest that our modeling method can benefit from changes in two particular areas: model selection and uncertainty estimation. In particular, the underestimation of the final epidemic size, peak timing and the number of cases at peak timing motivates the use of a more realistic model of uncertainty that will capture more of the uncertainty in the model best fit and therefore allow for broader and more realistic confidence intervals of these key quantities. In addition to this, the incidence trajectories from the logistic equation are always symmetric, something that is not necessarily true for real-world epidemic data. Hence, we decided to employ a model that incorporates two additional parameters in the logistic equation that modulate the final epidemic size and the initial epidemic growth profile (Chowell et al., 2016a). We compared our results with those obtained using Uncertainty Method 2 and the GRM.
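The symmetry of logistic incidence curves noted above follows directly from the closed-form solution of Eq. (1). As a short worked derivation, assume an initial condition $C(0) = C_0 < K$ and write $t_m$ for the inflection time (both symbols are introduced here for illustration and do not appear elsewhere in the article):

```latex
% Closed-form solution of the logistic equation C'(t) = r C (1 - C/K), with C(0) = C_0:
C(t) = \frac{K}{1 + e^{-r (t - t_m)}},
\qquad
t_m = \frac{1}{r} \ln\!\left(\frac{K}{C_0} - 1\right).

% Differentiating gives the incidence curve:
C'(t) = \frac{r K \, e^{-r (t - t_m)}}{\left(1 + e^{-r (t - t_m)}\right)^{2}}
      = \frac{r K}{4} \, \operatorname{sech}^{2}\!\left(\frac{r (t - t_m)}{2}\right).
```

Because $\operatorname{sech}^2$ is an even function, $C'(t_m + s) = C'(t_m - s)$ for every $s$: incidence peaks at $t_m$ with height $rK/4$, when $C = K/2$, and rises and falls at the same pace. The extra parameters p and a in Eq. (2) relax this constraint.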
3.3. Post-challenge model and uncertainty method comparison analysis
In contrast to the results during the challenge, quantitative improvements were seen with the logistic model by using Uncertainty Method 2 (Fig. S4; Supplemental material). For instance, the mean MAPE across all scenarios and time points of the effective reproduction number decreased to 0.10 with Uncertainty Method 2. Similarly, across all scenarios, incidence target RMS decreased from 177.83 to 78.00 using Uncertainty Method 1 and Uncertainty Method 2, respectively (see Table S1 of Supplemental material and Table 1 of main text). Performance statistics for the logistic model using Uncertainty Method 1 are reported in Table S1 of the Supplemental material and should be compared with results from Uncertainty Method 2 in Table 1 of the main text.
Post-challenge incidence forecasting performance metrics are summarized in Table 1 and post-challenge incidence forecast trajectories are illustrated for all scenarios in Fig. 1. Using Uncertainty Method 2, the GRM provided improved incidence target forecasts compared to the logistic model when the models were calibrated on an increasing set of incidence data. In particular, the GRM had lower mean RMS values in every scenario than the logistic model (see Table 1). For example, mean RMS decreased from 66.80 (logistic) to 48.39 (GRM) in scenario 2. Furthermore, the GRM performed better across all scenarios and time points than the logistic model. In particular, RMS averaged across all scenarios decreased from 78.00 (logistic) to 60.80 (GRM) (Table 1). Similar improvements were seen when taking the mean across all scenarios and time points for Pearson's R score and the mean absolute percentage error; Pearson's R score increased from 0.15 (logistic) to 0.36 (GRM) and the MAPE decreased from 0.38 (logistic) to 0.32 (GRM). The GRM slightly outperformed the logistic model in scenario 1 with incidence RMS decreasing by 1.01% when averaging across all time points (Table 1). Additionally, the GRM had better agreement with the trend of incidence targets, with a higher Pearson R score of 0.55 than the logistic model's 0.33 (Table 1).
In scenario 2, the GRM displayed better performance than the logistic model with incidence RMS decreasing by 27.56% when averaging across all time points. As in scenario 1, the GRM showed better agreement with the trend of incidence targets with a higher Pearson R score (GRM: 0.51, logistic: 0.47) (Table 1).
Once again, the GRM displayed better performance in scenario 3 than the logistic model with incidence RMS decreasing by 11.68% when averaging across all time points. The GRM showed better agreement with the incidence targets with a higher Pearson R score than the logistic (GRM: 0.31, logistic: −0.10) (Table 1).
Scenario 4 displayed the biggest difference in incidence forecasting with the GRM outperforming the logistic model with a 32.36% decrease in incidence RMS when averaging across all time points. Again, the GRM showed better agreement with the incidence targets with a Pearson R score of 0.36 compared to the logistic model's −0.08 (Table 1).
We did not include time point 1 in our analysis for final epidemic size predictions, because the amount of data available for model calibration was insufficient to constrain estimates of K in scenario 4. Considering time points 2–5 and scenarios 1–4, the overall uncertainty in the predicted epidemic size was reduced as more data was made available for model calibration, but the GRM achieved better coverage of the observed final epidemic size than the logistic. In particular, Fig. 2 shows that 95% confidence bars of final epidemic size predictions provided by the GRM contained the true epidemic size 8 out of 16 times (50% success rate) and had an average MAPE of 0.30 across all scenarios. In contrast, the logistic model consistently underestimated the final epidemic size in all scenarios during time points 2–5 with an average MAPE of 0.31 across all scenarios and 95% confidence bars that never contained the epidemic size, see Fig. 2.
Estimations of the generation interval assuming a gamma distribution yielded reasonably good fits, with mean generation times in the range of 11.9–17.1 days and variance in the range of 8.3–42.3 days across scenarios 1, 3 and 4 (Fig. S2).
Using the estimated mean generation time and variances from transmission tree data from scenarios 1, 3 and 4 to calculate the effective reproduction number yielded overestimates. Most notable are the estimations by both models in scenario 4, where the variance was the largest at 23.7 days. Across all scenarios, the GRM performed better than the logistic with a MAPE of 2.37, while the logistic model had a MAPE value of 2.64. In contrast, estimates of the effective reproduction number provided reasonable predictions under the assumption of a gamma distributed generation interval with a mean of 16 days and variance of 8 days, see Fig. 3 and Supplemental Table S2. In particular, the GRM again outperformed the logistic model with a MAPE of 0.08 compared to 0.10, averaged across all scenarios and time points.

Table 1
Incidence performance statistics for the logistic growth model and generalized Richards equation.

| Scenario | Time point | Logistic: Pearson's R | Logistic: RMS | Logistic: MAPE | Generalized Richards: Pearson's R | Generalized Richards: RMS | Generalized Richards: MAPE |
|---|---|---|---|---|---|---|---|
| Scenario 1 | Time point 1 | −0.13 | 38.43 | 0.32 | 0.65 | 45.17 | 0.40 |
| | Time point 2 | −0.79 | 105.02 | 0.28 | 0.95 | 80.95 | 0.22 |
| | Time point 3 | 0.76 | 73.27 | 0.29 | −0.65 | 140.61 | 0.56 |
| | Time point 4 | 0.97 | 80.44 | 0.64 | 0.97 | 51.08 | 0.40 |
| | Time point 5 | 0.84 | 31.21 | 0.78 | 0.85 | 7.24 | 0.16 |
| | Scenario mean | 0.33 | 65.67 | 0.46 | 0.55 | 65.01 | 0.35 |
| Scenario 2 | Time point 1 | 0.98 | 17.16 | 0.11 | 0.98 | 34.62 | 0.24 |
| | Time point 2 | −0.83 | 107.98 | 0.46 | −0.85 | 78.89 | 0.33 |
| | Time point 3 | 0.58 | 143.75 | 0.49 | 0.81 | 28.14 | 0.09 |
| | Time point 4 | 0.72 | 51.79 | 0.28 | 0.75 | 80.90 | 0.52 |
| | Time point 5 | 0.91 | 13.37 | 0.44 | 0.90 | 19.41 | 0.64 |
| | Scenario mean | 0.47 | 66.81 | 0.36 | 0.52 | 48.39 | 0.36 |
| Scenario 3 | Time point 1 | −0.89 | 21.16 | 0.51 | 0.89 | 19.48 | 0.45 |
| | Time point 2 | −0.97 | 77.11 | 0.48 | −0.97 | 62.15 | 0.41 |
| | Time point 3 | −0.46 | 76.43 | 0.28 | 0.38 | 17.86 | 0.05 |
| | Time point 4 | 0.91 | 18.14 | 0.08 | 0.93 | 69.41 | 0.41 |
| | Time point 5 | 0.87 | 10.38 | 0.23 | 0.88 | 10.56 | 0.24 |
| | Scenario mean | −0.11 | 40.64 | 0.31 | 0.42 | 35.89 | 0.31 |
| Scenario 4 | Time point 1 | 0.98 | 61.12 | 0.48 | 0.98 | 31.34 | 0.28 |
| | Time point 2 | 0.57 | 73.15 | 0.23 | 0.83 | 93.54 | 0.31 |
| | Time point 3 | −0.13 | 89.53 | 0.21 | −0.13 | 73.05 | 0.17 |
| | Time point 4 | −0.94 | 171.94 | 0.40 | −0.96 | 90.24 | 0.18 |
| | Time point 5 | −0.90 | 298.75 | 0.61 | −0.87 | 181.53 | 0.37 |
| | Scenario mean | −0.08 | 138.90 | 0.39 | −0.03 | 93.94 | 0.26 |

Fig. 1. Epidemic forecasts based on the logistic (Eq. (1); left column) and the generalized Richards model (Eq. (2); right column) calibrated on epidemic data from all 5 time points for scenarios 1, 2, 3 and 4, respectively. The mean (solid black line) and 95% CI prediction cone (shaded gray area with dashed border) for the calibrated model of 200 forecasting ensembles are displayed along with the synthetic epidemic data (red line). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Mean estimates of the deceleration of growth parameter (p) during the early growth phase derived by fitting the GGM to the first 6, 8 and 10 weeks of the epidemic ranged from 0.45–0.54, 0.5–0.74, 0.38–0.5 and 0.37–0.51 for scenarios 1, 2, 3 and 4 respectively (Fig. 4). The ranges of p across all scenarios support sub-exponential growth profiles with substantial uncertainty.
4. Discussion
To gain a better understanding of the impact of model assumptions during real-time forecasting of epidemics, we have assessed the forecasting performance of two relatively simple phenomenological models using data from the 2015 Ebola Challenge. During the competition, we employed the logistic equation to provide estimates of epidemic size, peak timing and the effective reproduction number using Uncertainty Method 1. The simplicity of this approach allowed us to rapidly provide estimates, but produced poor forecasting estimates when coupled with Uncertainty Method 1 because it failed to capture the uncertainty associated with the best fit to data. Our retrospective analysis indicates that improved uncertainty measures can be obtained using parametric bootstrap with Poisson error structure (Uncertainty Method 2) (Chowell et al., 2006). We compared the performance of the logistic model and the generalized Richards model calibrated with varying amounts of epidemic data. By changing the method used to model error in the best fit to data, we improved the logistic model's ability to estimate the effective reproduction number. This highlights the impact of the calibration process on the ability of the model to estimate key quantities. Although the logistic model coupled with Uncertainty Method 2 was an improvement, we saw an even further improvement when using the GRM incorporating flexible early epidemic growth profiles. In particular, the GRM obtained closer final epidemic size estimations with less data than the logistic. Finally, the logistic equation and the GRM provided similar estimates of the reproduction number and provided reasonably accurate results given their phenomenological nature.
Fig. 2. Comparison of final epidemic size predictions derived using the logistic growth and generalized Richards models. The generalized Richards model (Eq. (2); right column) provided improved forecasts over the logistic model (Eq. (1); left column) of the expected epidemic size using data of the evolving epidemic at time points 1, 2, 3, 4 and 5. The mean and 95% confidence bounds (vertical bars) for each prediction time point are shown. The red dashed horizontal line shows the actual epidemic size for each scenario. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Inclusion of the parameter p in the GRM is motivated by studies that have recently shown support for the presence of early sub-exponential growth dynamics (Chowell et al., 2016a; Viboud et al., 2016). In particular, previous studies have hypothesized that sub-exponential growth patterns could manifest from spatially constrained contact structures, control interventions and population behavior changes (Chowell et al., 2016a; Viboud et al., 2016). Future work in this direction could include an analysis of the sensitivity of p with respect to spatially constrained contact structures. In addition, the logistic model and GRM both assume that as more cases accumulate, the susceptible population is depleted. However, this phenomenological saturation effect in these models only becomes important during the later stages of the epidemic and could capture behavior changes, public health interventions and other disease prevention strategies that may take place during an evolving epidemic.
Our mean "synthetic" estimates of the reproduction number during the early epidemic growth phase are in broad agreement with published estimates of the reproduction number derived from real Ebola epidemics, including past outbreaks in Central Africa (Chowell et al., 2004; Legrand et al., 2007), estimates derived for the 2014–15 Ebola epidemic in West Africa (Chowell and Nishiura, 2014; Althaus, 2014; Towers et al., 2016; Nishiura and Chowell, 2014; Fisman et al., 2014) and estimates based on transmission tree data (Faye et al., 2015; Cleaton et al., 2015). Moreover, it is worth noting that our estimates of the effective reproduction number follow a declining trend during the early growth phase, a pattern that is in line with polynomial rather than exponential early epidemic growth dynamics (Chowell et al., 2015, 2016b; Viboud et al., 2016). Polynomial epidemic growth could result from a number of factors including contact network characteristics (Salathe and Jones, 2010) and reactive behavior changes that gradually mitigate the transmission rate (Chowell et al., 2015). Future work could perform sensitivity analysis of this estimation method with respect to the length of the generation interval, as was done in Chowell et al. (2016a) in the recent analysis of a Zika virus outbreak in Antioquia, Colombia.
Although we are not aware of a way to directly obtain the effective reproduction number or the basic reproduction number for the generalized Richards model, a formula for the basic reproduction number was derived for the Richards equation in Wang et al. (2012). Their derivation is based on the fact that the growth rate should be r/a instead of r in the Richards model.
Simple phenomenological models composed of a small number of equations and parameters have shown promise in generating forecasts of epidemic impact based on early outbreak data (e.g., (Chowell et al., 2014; Nishiura and Chowell, 2014; Fisman et al., 2013; Hsieh et al., 2004)). For instance, the well-known logistic model provides a simple description of a single epidemic outbreak using only two parameters: the growth rate r and the final epidemic size K. However, a limitation of this and other models is the rigid assumption of early exponential growth dynamics. Using the logistic model, the exponential growth assumption was shown to work relatively well to describe and generate forecasts of the 2014 Ebola epidemic in Liberia (Chowell et al., 2014), but it failed to provide a good fit to the early epidemic phase of the Ebola epidemics in Guinea and Sierra Leone (Chowell et al., 2014), where polynomial growth better characterized the early epidemic growth phase in those countries (Chowell et al., 2015). Our work here based on synthetic Ebola epidemic data derived from a detailed agent-based model (Merler et al., 2015) and a recent analysis of a Zika epidemic in Antioquia, Colombia (Chowell et al., 2016a) further emphasize the importance of designing models that reliably capture the epidemic growth phase of epidemic outbreaks in order to generate improved disease forecasts.
Fig. 3. Mean estimates of the effective reproduction number from the logistic growth model (Eq. (1); left column) and the generalized Richards model (Eq. (2); right column) derived from Eq. (3). Models provided reasonable forecasts among all scenarios (rows). Predictions for each scenario are obtained by fitting the corresponding model to an increasing amount of epidemic data (time points 1, 2, 3, 4 and 5, respectively) and using Eq. (3) with a gamma distributed generation time with mean 16 days and variance of 8 days.
Reliably assessing a developing infectious disease outbreak as quickly as possible allows policymakers to make swift and well-informed decisions on the type and intensity of interventions that would be needed to ensure epidemic control. When substantial uncertainty surrounding the transmission, clinical, or epidemiological characteristics of the infectious agent encumbers the development of mechanistic transmission models that incorporate details about transmission modes, epidemiological stages, and effects of interventions, phenomenological models (e.g. (Chowell et al., 2014; Fisman et al., 2013; Hsieh and Cheng, 2006)) based on a small number of equations and parameters have the potential to provide a starting point to forecast epidemic impact (e.g. epidemic size), assess the early growth phase during the first few disease generations, and characterize the reproduction number. Such models represent a starting point towards a "first response" suite of mathematical models for addressing emerging infectious disease outbreaks.
Fig. 4. (Left column) Short-term epidemic forecasts based on the generalized-growth model calibrated using an increasing amount of epidemic data (red line): 6, 8, and 10 epidemic weeks into each scenario. The mean (black solid line) and 95% confidence prediction cone (shaded gray area with dashed border) of 200 forecasting ensembles are shown. At 6, 8 and 10 weeks, the model was trained on all previous data and forecasted 4 weeks into the future. (Right column) Mean estimates and corresponding 95% confidence intervals of the deceleration of growth parameter, p, derived using the generalized-growth model fitted to an increasing amount of case incidence data: 6, 8, and 10 epidemic weeks. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Conflicts of interest
None.
Funding
BP and GC would like to acknowledge financial support from the NSF funded grant DMS-15185. GC also acknowledges financial support from the Division of International Epidemiology and Population Studies, The Fogarty International Center, US National Institutes of Health, NSF grant 1414374 as part of the joint NSF-NIH-USDA Ecology and Evolution of Infectious Diseases program; UK Biotechnology and Biological Sciences Research Council grant BB/M008894/1, NSF-IIS RAPID award #1518939, and NSF grant 1318788 III: Small: Data Management for Real-Time Data Driven Epidemic simulation.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.epidem.2016.11.002.
References
Althaus, C.L., 2014. Estimating the reproduction number of Zaire ebolavirus (EBOV) during the 2014 outbreak in West Africa. PLOS Curr. Outbreaks Ed. 1, 10.1371/currents.outbreaks.91afb5e0f279e7f29e7056095255b288.
Balcan, D., Hu, H., Goncalves, B., Bajardi, P., Poletto, C., Ramasco, J.J., Paolotti, D., Perra, N., Tizzoni, M., Van den Broeck, W., 2009. Seasonal transmission potential and activity peaks of the new influenza A(H1N1): a Monte Carlo likelihood analysis based on human mobility. BMC Med. 7, 45.
Chowell, G., Nishiura, H., 2014. Transmission dynamics and control of Ebola virus disease (EVD): a review. BMC Med. 12 (1), 196.
Chowell, G., Hengartner, N.W., Castillo-Chavez, C., Fenimore, P.W., Hyman, J.M., 2004. The basic reproductive number of Ebola and the effects of public health measures: the cases of Congo and Uganda. J. Theor. Biol. 229 (1), 119–126.
Chowell, G., Ammon, C.E., Hengartner, N.W., Hyman, J.M., 2006. Transmission assessing the effects of hypothetical interventions. J. Theor. Biol. 241 (2), 193–204.
Chowell, G., Simonsen, L., Viboud, C., Kuang, Y., 2014. Is West Africa approaching a catastrophic phase or is the Ebola epidemic slowing down? Different models yield different answers for Liberia. PLoS Curr. 2014 (6).
Chowell, G., Viboud, C., Hyman, J.M., Simonsen, L., 2015. The Western Africa Ebola virus disease epidemic exhibits both global exponential and local polynomial growth rates. PLoS Curr. 7.
Chowell, G., Hincapia-Palacio, D., Ospina, J., Pell, B., Tariq, A., Dahal, S., Moghadas, S., Smirnova, A., Simonsen, L., Viboud, C., 2016a. Using phenomenological models to characterize transmissibility and forecast patterns and final burden of Zika epidemics. PLOS Curr. Outbreaks, http://dx.doi.org/10.1371/currents.outbreaks.f14b2217c902f453d9320a43a35b9583, pii: ecurrents.outbreaks.f14b2217c902f453d9320a43a35b9583.
Chowell, G., Viboud, C., Simonsen, L., Moghadas, S.M., 2016b. Characterizing the reproduction number of epidemics with early sub-exponential growth dynamics. J. R. Soc. Interface 13 (123), pii: 20160659.
Chretien, J.P., Riley, S., George, D.B., 2015. Mathematical modeling of the West Africa Ebola epidemic. eLife 4.
Cleaton, J.M., Viboud, C., Simonsen, L., Hurtado, A.M., Chowell, G., 2015. Characterizing Ebola transmission patterns based on internet news reports. Clin. Infect. Dis. 62 (1), 24–31.
Colizza, V., Barrat, A., Barthelemy, M., Vespignani, A., 2006. The role of the airline transportation network in the prediction and predictability of global epidemics. Proc. Natl. Acad. Sci. U.S.A. 103 (7), 2015–2020.
Ebola Challenge website http://www.ebola-challenge.org/.
Faye, O., Boelle, P.Y., Heleze, E., Faye, O., Loucoubar, C., Magassouba, N., Soropogui, B., Keita, S., Gakou, T., Bahel, H.I., et al., 2015. Chains of transmission and control of Ebola virus disease in Conakry, Guinea, in 2014: an observational study. Lancet Infect. Dis. 15 (3), 320–326.
Fisman, D.N., Hauck, T.S., Tuite, A.R., Greer, A.L., 2013. An IDEA for short term outbreak projection: nearcasting using the basic reproduction number. PLoS One 8 (12), e83622.
Fisman, D., Khoo, E., Tuite, A., 2014. Early epidemic dynamics of the West African Ebola outbreak: estimates derived with a simple two-parameter model. PLoS Curr. 2014 (6).
Fraser, C., 2007. Estimating individual and household reproduction numbers in an
emergingepidemic.PLoSOne2(8),e758.
Hsieh,Y.H.,Cheng,Y.S.,2006.Real-timeforecastofmultiphaseoutbreak.Emerg.
Infect.Dis.12(1),122–127.
Hsieh,Y.H.,Lee,J.Y.,Chang,H.L.,2004.SARSepidemiologymodeling.Emerg.Infect.
Dis.10(6),1165–1167,authorrely1167–1168.
Legrand,J.,Grais,R.F.,Boelle,P.Y.,Valleron,A.J.,Flahault,A.,2007.Understanding
thedynamicsofEbolaepidemics.Epidemiol.Infect.135(4),610–621.
Marquardt,D.W.,1963.Analgorithmforleast-squaresestimationofnonlinear
parameters.J.Soc.Ind.Appl.Math.,SIAM11,431–441.
Merler,S.,Ajelli,M.,Fumanelli,L.,Gomes,M.F.,Piontti,A.P.,Rossi,L.,Chao,D.L., LonginiJr.,I.M.,Halloran,M.E.,Vespignani,A.,2015.Spatiotemporalspreadof
the2014outbreakofEbolavirusdiseaseinLiberiaandtheeffectivenessof
non-pharmaceuticalinterventions:acomputationalmodellinganalysis.Lancet
Infect.Dis.15(2),204–211.
Moré,J.J.,1978.TheLevenberg-MarquardtAlgorithm:ImplementationandTheory
NumericalAnalysis.Springer,pp.105–116.
Nishiura,H.,Chowell,G.,2009.Theeffectivereproductionnumberasapreludeto
statisticalestimationoftime-dependentepidemictrends.In:Chowell,G.,
Hyman,J.M.,Bettencourt,L.M.A.,Castillo-Chavez,C.(Eds.),Mathematicaland
StatisticalEstimationApproachesinEpidemiology.Springer,TheNetherlands,
pp.103–121.
Nishiura,H.,Chowell,G.,2014.Earlytransmissiondynamicsofebolavirusdisease
(EVD),westafrica,marchtoaugust2014.EuroSurveill.19(36).
Pell,B.,Baez,J.,Phan,T.,Gao,D.,Chowell,G.,Kuang,Y.,2016.PatchmodelsofEVD
transmissiondynamics.In:Chowell,G.,Hyman,J.M.(Eds.),Mathematicaland
StatisticalModelingforEmergingandRe-emergingInfectiousDiseases.
Springer.
Salathe,M.,Jones,J.H.,2010.Dynamicsandcontrolofdiseasesinnetworkswith
communitystructure.PLoSComput.Biol.6(4),e1000736.
TeamWHOER,2014.Ebolavirusdiseaseinwestafrica–thefirst9monthsofthe
epidemicandforwardprojections.N.Engl.J.Med.371(16),1481–1495.
Towers,S.,Patterson-Lomba,O.,Castillo-Chavez,C.,2016.Temporalvariationsin
theeffectivereproductionnumberofthe2014westafricaebolaoutbreak.
PLOSCurr.Outbreaks,2014.
Viboud,C.,Simonsen,L.,Chowell,G.,2016.Ageneralized-growthmodelto
characterizetheearlyascendingphaseofinfectiousdiseaseoutbreaks.
Epidemics15,27–37.
Wang,X.S.,Wu,J.,Yang,Y.,2012.Richardsmodelrevisited:validationbyand
applicationtoinfectiondynamics.J.Theor.Biol.313,12–19.
chowell,G.,Viboud,C.,2016.Isitgrowingexponentiallyfast?–impactof
assumingexponentialgrowthforcharacterizingandforecastingepidemics