Improving
the
Reproducibility
of
LaTeX
Documents
by
Enriching
Figures
with
Embedded
Scripts
and
Data
ChristianT.Jacobs
DefenceScienceandTechnology Laboratory(Dstl)
Abstract
The introductionof openaccessdatapolicies by researchcouncils,the enforcementof bestpractices,andthedeploymentofpersistentonlinerepositorieshaveenableddatasets thatsupportresultsinscientificpaperstobecomemorewidelyaccessible. Unfortunately, despite this advancement in the curation/publishing workflow, the data-driven figures within apaper oftenremain difficult toreproduce. Plotting or analysis scripts rarely accompany the manuscriptor any associated software release; andeven if they do, it maybeunclearexactlywhichversionwasused. Furthermore,theprecisecommandsand parametersusedtoexecutethescriptsareoftennotincludedinaREADMEfileorinthe paper itself. Thispaperintroduces anewopen sourcedigitalcurationtool, Pynea, for improvingthe reproducibilityof LaTeXdocuments. Eachfigurewithinadocument is enrichedbyautomaticallyembeddingtheplottingscriptanddatafilesrequiredtogenerate it,suchthatitcanberegeneratedbyreadersofthepaperinthefuture. Thecommandused toexecutetheplottingscriptisalsoaddedtothefigure’smetadata,alongwithdetailsof thespecificversionofthescriptused(ifthescriptistrackedwiththeGitversioncontrol system). Ifthedocumentistoberecompiledwithafigurethathassincechanged,orhad itsplottingscriptordatafilesmodified,thefigureisregeneratedsuchthattheauthorcan beconfidentthatthelatestversionofthefigureanditsdependenciesareincluded.
Received06April2019 | Revisionreceived30June2019 | Accepted12August2019
CorrespondenceshouldbeaddressedtoDrChristianT.Jacobs,DefenceScienceandTechnologyLaboratory(Dstl), PortonDown,Salisbury,Wiltshire,SP40JQ,UnitedKingdom,Email:[email protected]
TheInternationalJournalofDigitalCurationisaninternationaljournalcommittedtoscholarlyexcellenceanddedicated totheadvancementofdigitalcurationacrossawiderangeofsectors. TheIJDCispublishedbytheUniversityof EdinburghonbehalfoftheDigitalCurationCentre.ISSN:1746-8256.URL:http://www.ijdc.net/
Copyrightrestswiththeauthors.ThisworkisreleasedunderaCreativeCommonsAttribution4.0 InternationalLicence.Fordetailspleaseseehttp://creativecommons.org/licenses/by/4.0/
InternationalJournalofDigitalCuration
2020,Vol.14,Iss.1,292–302. 292
Introduction
Motivation
Journalarticles and conference proceedingsremaina popular meansof disseminating importantresults,backedupbydata-drivenfiguresorplots,toavastaudience. However, ensuringthatthesefigurescanbereadilyreproducedby others,throughtheavailability ofthe underpinningsoftware anddata, isa crucialaspect ofthe workflow (deLeeuw,
2001) thathas largelybeenoverlookedin thepast. Thishas occasionallyled toa lack ofconfidence inresults by peers andthe potential for thecredibility of thedata to be calledintoquestion(Allison,Brown,George&Kaiser,2016;Donoho,Maleki,Rahman, Shahram&Stodden,2009;Stoddenetal.,2013). Thescientificresearchandpublication processisundergoingsignificantreforminresponsetothisso-calledreproducibilitycrisis (Baker,2016; LeVeque, Mitchell&Stodden, 2012). Indeed, recent effortsby funding bodies,publishers,andeducationalestablishmentsareattemptingtoaddressthisissueby facilitatingtheimplementationofreproducibleresearchpractices.
Therequirementtodepositdatafilesinopenly-accessiblerepositories,suchasthose provided by theUK Data Service1, Figshare2 or Zenodo3, is now mandatedby many fundingbodiessuchastheEconomicandSocialResearchCouncil (ESRC)(Economic &SocialResearch Council(ESRC),2017) andtheEngineering andPhysical Sciences ResearchCouncil(EPSRC)4.Thesestricterpoliciesandapproacheswillhopefullyfurther the longevity of the data that supports funded publications. Similarly, the release of thesoftware underopensourcelicencesalsoincreasesthetransparencyofthescientific workflowandalsoitspotentialuptakewithnewusers(Easterbrook,2014;vonKrogh& Spaeth,2007). Educationalresources, suchastheOpenScienceMassive OpenOnline Course(MOOC)5, are raising awareness of various techniques and tools available to researchers,anddemonstratinghowtousethemeffectively,tofacilitateamoreopenand transparentscientificworkflow. Finally,dedicatedsoftwarejournals,suchasSoftwareX6
andTheJournalofOpenSourceSoftware7,arearelativelynewphenomenonintherealm ofscientificpublishing. Notonlydotheyofferarouteforresearcherstoobtainrecognition forthiskey componentof theresearch workflow,they encouragethesubmissionofthe softwareitselftopersistentonlinerepositories;thisinturncanbereadily accomplished usingtoolssuchasPyRDM(Jacobs,Avdis,Gorman&Piggott, 2014)anddvn(Leeper,
2014). Asaresult,thelongevityofthesoftware’sexistenceisgreatlyenhanced,enabling itsfutureusebyotherresearchersaroundtheworld.
Unfortunately,despite theincreasing availabilityof theunderpinning software and data,theprocessofre-generating afigureinascientificmanuscriptisoftenstillfraught withdifficulty. Forexample,theprecisecommandsandparameterspassedtotheplotting scriptto produceaparticular figurearefrequently omittedfromREADMEfiles orthe
1 UKDataService: https://ukdataservice.ac.uk/ 2 figshare: https://figshare.com/
3 Zenodo:https://zenodo.org/
4 EPSRCwebsite-Expectations:https://epsrc.ukri.org/about/standards/researchdata/expectations/ 5 OpenScienceMOOC:https://opensciencemooc.eu/
6 SoftwareX:https://www.journals.elsevier.com/softwarex 7 TheJournalofOpenSourceSoftware:https://joss.theoj.org/
manuscript itself. Similarly, evenif the manuscript includes a ‘Data Statement’with linkstoalltherelevantdatasets,itmaybeunclearwhichonewasusedinagivenfigure. Furthermore,theindividualplottingscriptsthemselvesareoftennotpublishedalongwith thesoftwareusedtogeneratethedataitself;andevenwhentheyareincluded,theyoften donotcomewithaversionnumber,makingit difficultto ascertainwhetherthecorrect versionisbeingusedwhenattemptingtoregenerateafigure. Thesecrucialcomponentsof theanalysisandcurationworkfloware,unfortunately,frequentlyoverlooked(Vandewalle,
2012) making the process of reproducing a figure tedious, error-prone, or practically impossible. Addressing this issue requires better support for manuscript authors, particularly through theuse of software tools that make theprocess of reproducing a figureasstraight-forwardaspossible.
ImprovingtheReproducibilityofFigures
LaTeXisadocumentpreparationsystem(Lamport,1994)thatisespeciallypopularwithin thescientificcommunity. Whenanauthorwishestosubmittheirmanuscripttoajournal, theLaTeX‘source’/markupcanbecompiled,typesetandrenderedasasinglemanuscript in Portable Document Format (PDF) (Adobe Systems Incorporated, 2006) which is supportedbymostmajor publishers. Indeed, thePDFisextremelyprevalentwithinthe publishingindustrybecauseitoffersafeature-rich,cross-platform environment. Italso includestheabilitytoembedfileswithinamanuscript,andtoenrichthemanuscriptitself viametadatafields. Byexploitingthiswiderangeoffunctionalityitispossibletodevelop toolsthatcanfacilitatethereproducibilityofthefigureswithinthemanuscriptitself.
ToolssuchasSweave(Leisch,2002)andknitr(Gandrud,2015;Xie,2015)arecapable ofincorporating executablescripts into a LaTeX documentin an effortto enhance its reproducibility(see e.g. (Hinsen,2011, 2015) forotherapproaches towardsenhancing adocument’s reproducibility). WhentheLaTeX markupis compiled, thesescriptsare replacedby theiroutput(e.g. afigureoranumericalvalue) inthefinalPDFdocument. However, the underlying data dependencies and scripts would also need to be made availabletoensurethatthefiguresarereproduciblebythewiderscientificcommunityin additiontothedocument’sauthor. Itmaybemoredesirabletoembedthescriptsanddata withinthefigures,andsubsequentlyembedthosefiguresintothefinalPDFfile,inorder toformastandalonereproducibledocument.
This paper introduces a new digital curation tool, called Pynea (currently in the prototypestage),for improvingthereproducibilityofdata-driven figureswithinLaTeX documents. Byspecifyingthepathstotheplottingscriptanddatafilesrequiredtogenerate agivenfigurewithintheLaTeXdocument’smarkup,Pyneaautomaticallyregeneratesthe figurewiththesefilesembeddedsuchthatitcanbereproducedinthefuture. Thespecific commandusedtoexecutetheplottingscriptisalsoincludedwithinthefigure’smetadata. Moreover,iftheplottingscriptistrackedwiththeGitversioncontrolsystem(Chacon& Straub,2014)then thespecificversion ofthat scriptis alsonotedwithinthe metadata. ThefiguresarethensubsequentlyembeddedwithinthefinalPDFdocumenttoachievea standalonedocumentwithimprovedreproducibility.
\begin{figure}
\includegraphics{/home/christian/my_paper/images/plot.pdf} \pyneascript{/home/christian/my_scripts/plot.py}
\pyneacommand{python3 plot.py}
\pyneadata{/home/christian/my_data/data1.txt /home/christian ,→ /my_data/data2.txt}
\end{figure}
Figure1. Exampleusage ofPynea’s macrosapplied toa singlefigure environment inaLaTeX
source file. The path to the figure itself is specified first, followed by three macro commandsto specify thepath tothe script,the commandusedto executeit, andthe pathstoitstwodatafiles.
ReportStructure
The ‘Method’ section presents the software architecture behind Pynea, including a discussionofthedesignchoicesmade,any limitationsoftheapproach,andinformation regarding howto obtainthe software. Anexample applicationthatshowcases thekey featuresofPyneaisprovidedinthe‘Application’section. Thepaperfinisheswithsome conclusionsandrecommendations.
Method
Pyneais astandalone Pythonapplicationthat comprises severalstages. Put briefly, it firstensuresthatanyfigurestobeincludedintheLaTeXmanuscriptareup-to-dateand embedstherelevantplotting/analysisscript(hereafterreferredtoasthescript)anddata files. It then invokes a suitableLaTeX compiler,such as pdflatex8 (Thành, 2000), to collatethefiguresandgeneratethefinalPDFmanuscript. Pyneasubsequentlyattaches thefigurefiles themselvestothemanuscript. However, as aprerequisite step, the user mustspecifythe paths/locationsto theplottingscript,thecommandusedtorun it, and alsoanydatafilesitdependson. ThisisachievedbyaddingPynea-specificmacrostothe LaTeXmarkup,demonstratedinFigure1,whicharedefinedasfollows:
• \pyneascript{plot.py}tellsPyneathatthefigurewasgeneratedusingascript calledplot.py. Notethatscriptswritteninotherlanguagescanalsobeexecuted, aslongasthefigureoutputisinPDFformat.
• \pyneacommand{python3 plot.py}tellsPyneathatthescript istobeexecuted using Python 3. Note thatany pathsprovided hereare relative to the directory containingthescriptfile.
• \pyneadata{data1.dat data2.dat}tellsPyneathatthescriptdependsontwo datafiles,data1.datanddata2.dat.
8 pdfTeX-TeXUsersGroup:http://www.pdftex.org/
Followingthisprerequisitestep,Pyneacanberunatthecommandlinewiththeuser specifyingthepathto theLaTeX sourcefile. Figure2illustrates thekey tasksthatare subsequentlyperformed.
TheexampleLaTeXsourcefile,denotedexample.texinFigure2,isfirstparsedusing thethird-partyTexSoup9module. Thisextractsinformationabouteachfigure,namelythe expectedpathtothefigurefileitself(plot.pdfinFigure2)andtheinformationprovided inthethreemacros.
Eachfigure is thenconsidered inturn and may be regenerated prior to theLaTeX manuscriptbeingcompiled. Thisregenerationstepensuresthatthemostrecentversionof theplottingscriptanddatafileshavebeenusedtoproduceit. Furthermore,theusercan beconfidentthatthesamecommandspecifiedintheLaTeXfileistheonethathasbeen usedtoexecutethescript andproducethefigure. Afterafigurehasbeenregenerated,it isenrichedby usingthePyMuPDF10moduletoembedthefollowingfilesandmetadata forfuturereference:
• Thescript(embeddedfile). • Thedatafiles(embeddedfiles).
• Theplottingscript’sexecutioncommand(metadata).
• If thescript is underGitversion control, the versionof thescript thatwas used. ThistakestheformofaSHA-1hashfromGit’srevisionlog. (metadata).
All of these embedded resources help to improve each figure’s reproducibility. However, in the age of Big Data (Chen & Zhang, 2014), regenerating a figure can potentiallybeacomputationally-andmemory-intensiveoperation. Afigureistherefore regenerated only if thefile does not alreadyexist, if thescript or data files havebeen modified in the Git repository (determined using the GitPython11 module), or if the commandusedtoexecutethescripthaschanged. Ifthescriptordatafilesarenotunder Gitversioncontrol,Pyneaerrsonthesideofcautionandregeneratesthefigureeachtime. Ifthefigurehasnotbeenmodified,theexistingversionisused. Notealsothatonly PDF figuresarecurrentlysupportedbyPyneasuchthatfiguresinotherformatsarealwaysleft unchanged.
Afterall supported figureshavebeen enrichedwith theirscript file, data files, and commandmetadata, theLaTeX compilerisinvoked toproducethemanuscript inPDF format. However,whilethefigureswillberenderedintheirplaceholdersasexpected,the compilationstageignorestheembeddedfiles. Forthisreason,therawfigurefilesarealso attachedseparatelytothemanuscriptasafinalstepinPyneatoensurethescriptanddata filesremainaccessible. ThefinalPDFmanuscriptthereforehastwolayersofembedded files:
1. The inner layer comprises the scripts and data files (plot.py, data1.dat,
data2.datinFigure2)embeddedineachfigurefile.
2. Theouterlayercomprisestheenrichedfigurefiles(plot.pdfinFigure2)attached tothePDFmanuscript.
9 GitHub-alvinwan/TexSoup: https://github.com/alvinwan/TexSoup 10 GitHub-rk700/PyMuPDF:https://github.com/rk700/PyMuPDF
11 GitHub-gitpython-developers/GitPython: https://github.com/gitpython-developers/GitPython
Figure2. Aflowchartoutliningthekeystages ofPynea. ALaTeXdocumentcontainingonlya singlefigureisassumedhereforsimplicity,butPyneaiscapableofhandlingmultiple figures. Aredoutlineemphasisesthefactthatthefileisenrichedwithembeddedfiles.
OpeningtheresultingPDFmanuscriptinasuitablePDFreader,suchasAdobeReader, willdisplaytheembeddedfilesintheouterlayer. Thesecanbeextracted/savedmanually. Notethatembeddingthescriptanddatafilesinthefiguresthemselves,ratherthandirectly inthePDFmanuscript,allowsthefigurestobereusedelsewhereinareproduciblefashion (e.g. inslidedecksforconferencepresentations).
Itisworthnotingthat BigDataposesachallengefor theapproach taken byPynea, since the embedding of large data files would make the manuscript inconvenient to download,shareandstore,especiallyifthereaderdoesnotwishtoreproducethefigures within. When an author is ready to compile the final version of the manuscript, it maybemoreappropriatetoinsteadsubmitthedatadependencies(identifiedby thefile namesprovided in theLaTeX source) toa dedicated onlinerepository and embedthe repository’s Digital Object Identifier (DOI)12 and version information in the figures’ metadata. Readersofthemanuscriptcanthendownloadthedataseparately,shouldthey wishtodoso. However,caremustbetakenduringthiscurationstepinordertoguarantee thattheversionofthedataaccessibletoreadersisthesameversionusedbytheauthorto generatethemanuscript’sfigures. Pyneawouldideallyautomatethiscurationstageforthe author,facilitatedbytoolssuchasPyRDM(Jacobsetal.,2014)whichcaninterfacewith repositoryhostingservices. Suchanoptionofusingonlinerepositoriesisnotpossiblein thecurrent prototypeofPynea, butforms partoftheproject’sroadmapdiscussedinthe Conclusionssection.
Pynea is freely available to download under the open-source MIT licence from GitHub13.
12 DigitalObjectIdentifierSystem:https://www.doi.org/ 13 GitHub-ctjacobs/pynea:https://github.com/ctjacobs/pynea
\documentclass{article} \usepackage{graphicx} \usepackage{pynea}
\begin{document}
\section{Introduction}
This is an example of a LaTeX document containing an enriched figure ,→ file.
\begin{figure}[!ht] \begin{center}
\includegraphics[width=\columnwidth]{/home/christian/my_paper/ ,→ sine.pdf}
\pyneascript{/home/christian/my_scripts/sine.py} \pyneacommand{python3 sine.py}
\pyneadata{/home/christian/my_data/amplitude.dat /home/ ,→ christian/my_data/frequency.dat}
\caption{A sine wave.} \end{center}
\end{figure}
\end{document}
Figure3. ThecontentsoftheLaTeXsourcefile,paper.tex,usedinthesinewaveapplication.
Application
To demonstrate the outcomes of Pynea, considera Python script whichplots a time-dependent sine wave y(t) = Asin(2πf t), where A is the peak amplitude and f is the frequency. The plotting script readsin data from two files, amplitude.dat and
frequency.dat, forthequantitiesAand f respectively. Theresultingfigureisincluded withinamanuscriptusingtheLaTeXmarkupinFigure3.
After Pynea is run, Figure 4 shows how the script and data files are successfully embeddedinthesine.pdffigureviatheAttachmentstab/feature. User-definedfieldsare notpermittedwithinaPDF’smetadata. Therefore,Pyneaaddsanentryinthekeywords
metadatafieldinstead(also shownin Figure4), whichincludesthe Gitrevision ofthe scriptandthePythoncommandusedtoexecuteit. Similarly,thisfigurefileisattachedto thefinalPDFmanuscriptasdemonstratedinFigure5.
Figure4. The figurefile, sine.pdf,viewed using AdobeReader. The scriptanddata filesare listedbeneaththesinewaveplotintheAttachmentstab. Themetadataisalsoshownin theDocumentPropertiesdialogwindowontheright-handside.
Conclusions
Thecreation ofdigital curationtools, suchas Pynea, is a crucialstep towardsa more openandreproduciblescientific workflow. However, thesuccess ofsuchtools willbe largelydependent on their uptake and support by academia, industry, and publishers. Furthermore,anumberof issuesandcaveatswere identifiedduring thedevelopmentof thisworkwhichwillneedtobeaddressedtoimproveusability. Forexample,onecaveat istheinabilitytocreateuser-defined/custom metadatafieldsthroughPyMuPDF.Pynea worksaroundthisbyappendingtherelevantmetadatatothekeywordsfield, butthisis ofcoursenotideal. Moreover, fewfigureformats allowfor theembeddingoffilesina straight-forwardmanner,anditmaynotalwaysbepracticaltosaveeveryfigureasaPDF.
Restrictions that may be imposed by publishing bodies regarding the inclusion of embeddedfilescanpresentadditionalchallenges. Forexample,theembeddingoflarge datafileswillinturnincreasethefigure’sfilesize,forwhichtheremaybeanupperlimit imposedbythepublisher. Furthermore,whenajournalarticleisbeingtypesetusingthe publisher’sowntemplate,embeddedfilesmayautomaticallybestrippedout;particularly ifPDFfilesare‘flattened’,convertedtootherformats,orrenderedonthejournal’swebsite usingacustom/bespokePDFviewer,whichmaynotsupportembeddedfiles. Pyneaalso requirestheuseofcustommacros, whichmaynotbereadilyavailableinthepublisher’s template, although this will be less of an issue if the publisher manually embeds the enrichedfigurefilesprovidedbytheauthortothefinaltypesetmanuscriptinstead.
Pyneaiscurrentlyinaprototypestage. TheroadmapforPyneaincludestheaddition of a suite of tests in order to verify the functionality of Pynea and cover a range of use cases. Users are encouraged to submit any bug reports via Pynea’s GitHub repository,andcollaborativelysubmitimprovementstothecodebaseviamergerequests.
Figure5. The compiledLaTeXdocument, paper.pdf, viewedusing Adobe Reader. The indi-vidual figure file, which in turncontains the files shown in Figure 4, is listed inthe Attachmentstab.
Furtherimprovementsthatcouldalsobe madetoPyneaincludethesubmissionofdata dependenciesand plotting scripts to online repositories once the final version of the manuscriptis readyto be compiledby the author. Insteadof embeddingthe data and scriptsdirectlyintothecompiledmanuscript’sfigures,therepository’sDOIandversion informationcouldbeincludedinthemetadatainstead.
Acknowledgements
The author wishes to thank Neil Stapleton and an anonymous reviewer for their constructivefeedbackregardingthismanuscript.
References
Adobe Systems Incorporated. (2006). PDF Reference, sixth edition: Adobe Portable DocumentFormatversion1.7 (6th).AdobeSystemsIncorporated.
Allison,D.B.,Brown,A. W.,George,B.J. &Kaiser,K. A.(2016). Reproducibility:A tragedyoferrors.Nature,530,27–29.doi:10.1038/530027a
Baker,M. (2016).Isthereareproducibilitycrisis?Nature, 533,452–454. doi:10.1038/ 533452a
Chacon,S.&Straub,B.(2014).Progit(2nd).Apress.
Chen,C.L.P.&Zhang,C.-Y.(2014).Data-intensiveapplications,challenges,techniques and technologies: A survey on big data. Information Sciences, 275, 314–347.
doi:10.1016/j.ins.2014.01.015
de Leeuw, J. (2001). Reproducibleresearch. The bottom line. UCLA: Department of Statistics,UCLA.Retrievedfromhttps://escholarship.org/uc/item/9050x4r4
Donoho, D. L., Maleki, A., Rahman, I. U., Shahram, M. & Stodden, V. (2009). Reproducibleresearchincomputationalharmonicanalysis.ComputinginScience Engineering,11(1),8–18.doi:10.1109/MCSE.2009.15
Easterbrook,S.M.(2014).Opencodeforopenscience?NatureGeoscience,7,779–781.
doi:10.1038/ngeo2283
Economic & Social Research Council (ESRC). (2017). ESRC Research Funding Guide. Economic & Social Research Council (ESRC). Retrieved from https : //esrc.ukri.org/files/funding/guidance-for-applicants/research-funding-guide/
Gandrud,C.(2015).ReproducibleresearchwithRandRStudio(2nd).CRCPress.
Hinsen, K. (2011). A data and code model for reproducible research and executable papers.ProcediaComputerScience,4,579–588.doi:10.1016/j.procs.2011.04.061
Hinsen,K.(2015).ActivePapers:Aplatformforpublishingandarchivingcomputer-aided research.F1000Research,3.doi:10.12688/f1000research.5773.3
Jacobs,C.T.,Avdis,A.,Gorman,G.J.&Piggott,M.D.(2014).PyRDM:APython-based libraryforautomatingthemanagementandonlinepublicationofscientificsoftware anddata.JournalofOpenResearchSoftware,2,e28.doi:10.5334/jors.bj
Lamport,L.(1994).LaTeX-User’sGuideandReferenceManual(2nd).Addison-Wesley PublishingCompany.
Leeper, T. J. (2014). Archiving reproducible research with R and Dataverse. The R Journal,6(1),151–158.
Leisch,F.(2002).Sweave,PartI:MixingRandLaTeX.RNews,2(3),28–31.
LeVeque,R.J.,Mitchell,I.M.&Stodden,V.(2012).Reproducibleresearchforscientific computing: Toolsand strategies for changing theculture. Computing inScience Engineering,14(4),13–17.doi:10.1109/MCSE.2012.38
Stodden, V., Bailey, D. H., Borwein, J., LeVeque, R. J., Rider, W. & Stein, W. (2013). Settingthedefault toreproducible:Reproducibilityin computationaland experimentalmathematics.InProceedingsoftheicermworkshoponreproducibility incomputationalandexperimentalmathematics,10–14december2012.
Thành,H.T.(2000).Micro-typographicextensionstotheTeXtypesettingsystem(Doctoral dissertation,MasarykUniversity).
Vandewalle, P. (2012). Code sharing is associated with research impact in image processing.ComputinginScienceEngineering,14(4),42–47.doi:10.1109/MCSE .2012.63
vonKrogh,G.&Spaeth,S.(2007).Theopensourcesoftwarephenomenon:Characteristics thatpromoteresearch.TheJournal ofStrategicInformation Systems, 16(3),236–
253.doi:10.1016/j.jsis.2007.06.001
Xie,Y.(2015).DynamicdocumentswithRandknitr(2nd).CRCPress.