CEREBRUM: a fast and fully-volumetric Convolutional Encoder-decodeR for weakly-supervised sEgmentation of BRain strUctures from out-of-the-scanner MRI


Contents lists available at ScienceDirect

Medical Image Analysis

journal homepage: www.elsevier.com/locate/media


Dennis Bontempi a, Sergio Benini a, Alberto Signoroni a, Michele Svanera b,1,∗, Lars Muckli b,1

a Department of Information Engineering, University of Brescia, Brescia, Italy

b Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom

article info

Article history:

Received 30 September 2019
Revised 7 March 2020
Accepted 12 March 2020
Available online 24 March 2020

Keywords:

Brain MRI segmentation
Convolutional neural networks
Weakly supervised learning
3D Image analysis

abstract

Many functional and structural neuroimaging studies call for accurate morphometric segmentation of different brain structures starting from image intensity values of MRI scans. Current automatic (multi-)atlas-based segmentation strategies often lack accuracy on difficult-to-segment brain structures and, since these methods rely on atlas-to-scan alignment, they may require long processing times. Alternatively, recent methods deploying solutions based on Convolutional Neural Networks (CNNs) are enabling the direct analysis of out-of-the-scanner data. However, current CNN-based solutions partition the test volume into 2D or 3D patches, which are processed independently. This process entails a loss of global contextual information, thereby negatively impacting the segmentation accuracy. In this work, we design and test an optimised end-to-end CNN architecture that makes the exploitation of global spatial information computationally tractable, making it possible to process a whole MRI volume at once. We adopt a weakly supervised learning strategy by exploiting a large dataset composed of 947 out-of-the-scanner (3 Tesla T1-weighted 1 mm isotropic MP-RAGE 3D sequences) MR Images. The resulting model is able to produce accurate multi-structure segmentation results in only a few seconds. Different quantitative measures demonstrate an improved accuracy of our solution when compared to state-of-the-art techniques. Moreover, through a randomised survey involving expert neuroscientists, we show that subjective judgements favour our solution with respect to widely adopted atlas-based software.

1. Introduction

The segmentation of various brain structures from MRI scans is an essential process in several non-clinical and clinical analyses, such as the comparison, at various stages, of normal brain development or of the progression of neurodegenerative processes, neurological diseases, and psychiatric disorders. The morphometric approach is especially helpful in pathological situations for confirming the diagnosis, defining the prognosis, and selecting the best treatment. Moreover, brain structure segmentation is an early step in functional MRI (fMRI) study pipelines, as neuroscientists need to isolate specific brain structures before analysing the spatiotemporal patterns of activity within them.

∗ Corresponding author.

E-mail address: [email protected] (M. Svanera).

1 Shared authorship.

Manual segmentation, although considered to be the gold standard in terms of accuracy, is time consuming (Zhan et al., 2018). Therefore, neuroscience studies began to exploit computer vision to process data from increasingly powerful MRI scanners and ease the interpretation of brain data, intrinsically characterised by a strong inter-subject variability. Different fully automated pipelines have been developed in recent years (Despotović et al., 2015), moving from techniques based only on image features to ones that also make use of a-priori statistical knowledge about the neuroanatomy. The vast majority of the available tools apply a (multi-)atlas-based segmentation strategy (Cabezas et al., 2011), in which the segmentation of the target volume is inferred from one or several templates built from manual annotations. In order to make this inference phase possible, a time consuming and computationally intensive (FreeSurfer, 2008) non-rigid subject-to-atlas alignment is necessary. Due to the aforementioned high inter-subject brain variability, such registration procedures often introduce errors that yield a decrease in segmentation accuracy on brain structure or tissue boundaries (Klein et al., 2017; Lerch et al., 2017).

https://doi.org/10.1016/j.media.2020.101688

In recent years, Deep Learning (DL) techniques have emerged as one of the most powerful ways to combine statistical modelling of the data with pattern recognition for decision making and classification (Voulodimos et al., 2018), and their development is impacting various medical imaging domains (Hamidinekoo et al., 2018; Litjens et al., 2017). Provided that they are trained on a sufficient amount of data embodying the observable variability, DL models are able to generalise well to previously unseen data. Furthermore, they can work directly with out-of-the-scanner images, removing the need for the expensive scan-to-atlas alignment phase. Numerous DL-based algorithms proposed for brain MRI segmentation match or even improve the accuracy of atlas-based segmentation tools (Akkus et al., 2017; Rajchl et al., 2018; Roy et al., 2019; Wachinger et al., 2018). Due to the scarcity of training data and to hardware limitations, approaching this task using DL commonly requires the volume to be processed considering 2D (Roy et al., 2019) or 3D patches (Fedorov et al., 2017; Rajchl et al., 2018; Dolz et al., 2019; Wachinger et al., 2018; Li et al., 2017) at a time. Although this method simplifies the process from a technical point of view, it introduces significant limitations in the analysis: since each 2D or 3D patch is segmented independently from the others, these models mostly exploit local spatial information - ignoring “global” cues, such as the absolute and relative positions of different brain structures - which makes them sub-optimal.

Different works have considered the potential improvements of removing said volume partitioning (McClure et al., 2018; Wachinger et al., 2018). Solutions that exploit such a fully-volumetric approach have already been applied to prostate (Milletari et al., 2016), heart atrium (Savioli et al., 2019), and proximal femur MRI segmentation (Deniz et al., 2018), but not yet to brain MRI segmentation - where this strategy could prove particularly useful given the complex geometry and the variety of structures characterising the brain anatomy. Here, we discuss how both hardware limitations and the scarcity of hand-labelled ground truth (GT) data can be overcome. First, we tackle the former by customising and simplifying the model architecture. Second, we cope with the latter by training our model on segmentation masks obtained by exploiting atlas-based techniques, in what can be considered a weakly supervised fashion - more precisely, what Zhou (2017) and Li et al. (2019) describe as “inaccurate supervision”. Hence, even though CEREBRUM is trained exploiting labelling which is not exempt from errors, we demonstrate that the statistical reliability of atlas-based segmentation is enough to guarantee good generalisation capability of the DL models trained on such imperfect ground truth.

2. Existing methods for whole brain MRI segmentation and how to advance them

2.1. Atlas-based methods

In the last twenty years, several atlas-based segmentation methods have been developed. However, only a few of them are completely automatic, and thus pertinent to our discussion: FreeSurfer, FSL’s FAST and FIRST, and fMRIprep. FreeSurfer (Fischl, 2012) is an open-source software package that contains a completely automated pipeline for tissue and sub-cortical brain structure segmentation. FSL’s FAST (FMRIB’s Automated Segmentation Tool, Zhang et al., 2001) and FIRST (FMRIB’s Integrated Registration and Segmentation Tool, Patenaude et al., 2011) are part of Oxford’s open-source library of analysis tools for MRI and fMRI data. FAST segments different tissue types in already skull-stripped brain scans, while FIRST deals with the segmentation of sub-cortical brain structures. fMRIprep (Esteban et al., 2019) is a recently published preprocessing software for MRI scans that combines tools from widely used open-source neuroimaging packages (e.g., the above mentioned FSL and FreeSurfer). It implements a brain tissue segmentation pipeline, providing the user with both soft (i.e., probability maps) and hard segmentation.

These methods are widely used in neuroscience, since they produce consistent results with little human intervention. Nevertheless, they are all atlas-based and not learning-based - hence, the only way to improve their accuracy is to manually produce new atlases. Furthermore, since they implement a long processing pipeline together with the atlas-based labelling strategy, the segmentation operation is time consuming (FreeSurfer, 2008). Limitations of these approaches, such as the lack of accuracy on various brain structure boundaries, have been documented (Ellingsen et al., 2016; Wenger et al., 2014; Weier et al., 2012; Cabezas et al., 2011).

2.2. Deep learning methods

Many of the state-of-the-art methods based on deep learning exploit multi-modal MRI data (Çiçek et al., 2016; Chen et al., 2018; Dolz et al., 2019; Andermatt et al., 2016). Yet, in real-case scenarios and due to time constraints, the acquisition of different MRI sequences for anatomical analysis is rarely done: in most studies a single sequence is used - with T1w being the most popular protocol. Various alternatives have been proposed to obtain whole brain segmentation from T1w only. QuickNAT (Roy et al., 2019) leverages a 2D-based approach to efficiently segment brain MRI, exploiting a paradigm that aggregates the predictions of three different encoder-decoder models by averaging the probability maps - each model trained to segment a single slice at a time along one of the three principal axes (longitudinal, sagittal, and coronal). MeshNet (Fedorov et al., 2017; McClure et al., 2018) is a feedforward CNN based on 3D dilated convolutions, whose structure guarantees good results while keeping the number of parameters low. NeuroNet (Rajchl et al., 2018) is an encoder-multi-decoder CNN, trained to replicate segmentation results obtained with multiple state-of-the-art neuroimaging tools. DeepNAT (Wachinger et al., 2018) is composed of a cascade of two CNNs. It breaks the segmentation task into two hierarchical operations - the foreground-background separation, and the labelling of each voxel belonging to the foreground - implemented by the first and the second network, respectively. Finally, the solution presented in Li et al. (2017) makes use of various refinements, such as residual connections and dilated convolution, to favour the learning of 3D representations and increase the compactness of the proposed model. Such modifications are furthermore at the centre of the extensive analysis conducted by the authors in an effort to explain how they impact the model performance.
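The view-aggregation idea described above for QuickNAT (averaging the probability maps of three view-specific models) can be illustrated with a minimal NumPy sketch. The function name `aggregate_views`, the array layout `(X, Y, Z, n_classes)`, and the toy values are assumptions made for illustration; they are not taken from the QuickNAT implementation.

```python
import numpy as np

def aggregate_views(prob_maps):
    """Average per-view class probabilities and derive a hard segmentation.

    prob_maps: list of arrays shaped (X, Y, Z, n_classes), one per view
    (hypothetical layout, assumed here for illustration).
    """
    mean_probs = np.mean(np.stack(prob_maps, axis=0), axis=0)
    hard_labels = np.argmax(mean_probs, axis=-1)  # most probable class per voxel
    return mean_probs, hard_labels

# Toy example: a single voxel, two classes, three made-up view predictions.
longitudinal = np.array([[[[0.6, 0.4]]]])
sagittal = np.array([[[[0.2, 0.8]]]])
coronal = np.array([[[[0.3, 0.7]]]])
mean_probs, labels = aggregate_views([longitudinal, sagittal, coronal])
```

In this toy case the first view alone would pick class 0, while the averaged map favours class 1, which shows how disagreement between views is resolved by the mean.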

However, a common trait of these methods is that they do not fully exploit the 3D spatial nature of MRI data. Although QuickNAT tries to integrate spatial information by averaging the probability maps computed with respect to different views, it is slice-based. DeepNAT exploits an intrinsic parameterisation of the brain (through the Laplace-Beltrami operator) trying to introduce some spatial context, but as with MeshNet it is trained on small non-overlapping 3D patches. Finally, NeuroNet is trained on random crops of the MR volume, and so is the high-resolution compact CNN presented in Li et al. (2017).

2.3. Aims and contributions

Aiming to exploit both local and global spatial information contained in MRI data, we introduce CEREBRUM: a fast and fully-volumetric Convolutional Encoder-decodeR for weakly supervised sEgmentation of BRain strUctures from out-of-the-scanner MRI. To the best of our knowledge, CEREBRUM is the first DL model designed to tackle the brain MRI segmentation task in such a fully-volumetric fashion. This is accomplished by exploiting an end-to-end encoding-decoding structure, where only convolutional blocks are used. This delivers a whole brain MRI segmentation in just ∼5–10 s on a desktop GPU. The model architecture and the proposed learning framework are shown in Fig. 1.

Fig. 1. Overview of the proposed segmentation method. The model is trained on 900 T1w volumes and the associated relabelled FreeSurfer segmentation, while testing is performed by feeding NIfTI data to the model.

Since in most real case scenarios, to save scanner time, only single-modal MR images are collected, we develop and test our method on a large set of data (composed of 947 MRI scans) acquired using a T1-weighted (T1w) 1 mm isotropic MPRAGE protocol. Neither registration nor filtering is applied to these data, so that CEREBRUM learns to segment out-of-the-scanner volumes. Focusing on the requirements of a real case scenario (fMRI studies), we train the model to segment the classes of interest in the MICCAI challenge (Mendrik et al., 2015), i.e., gray matter (GM), white matter (WM), cerebrospinal fluid (CSF), ventricles, cerebellum, brainstem, and basal ganglia. Since manually annotating such a large body of data would require a prohibitive amount of human hours, we train our model on automatic segmentations obtained by FreeSurfer (Fischl, 2012) - relabelled to obtain the aforementioned set of seven classes.

We compare the proposed method with other CNN-based solutions: the well-known 2D-patch-based U-Net (Ronneberger et al., 2015), its 3D variant (Çiçek et al., 2016), and the state-of-the-art architecture QuickNAT (Roy et al., 2019) - which leverages the aggregation of three slightly modified U-Net architectures (trained on coronal, sagittal, and axial MRI slices, respectively). To ensure a fair comparison, we train these models by conducting an extensive hyperparameter selection process. Results are quantitatively evaluated exploiting the same metrics used in the MICCAI MRBrain Segmentation challenge, i.e., the Dice Similarity Coefficient, the 95th percentile Hausdorff Distance, and the Volumetric Similarity Coefficient (Taha and Hanbury, 2015), utilising FreeSurfer as GT reference. In addition, to assess the generalisation capability of the proposed model, we compare the obtained results against the FreeSurfer segmentation we used for training. To do so, we design a survey in which five expert neuroscientists (with more than five years of experience in MRI analysis) are asked to choose the most accurate segmentation between the two aforementioned ones. This qualitative test covers different areas of interest in neuroimaging studies, i.e., the early visual cortex (EVC), the high-level visual areas (HVC), the motor cortex (MCX), the cerebellum (CER), the hippocampus (HIP), the early auditory cortex (EAC), the brainstem (BST), and the basal ganglia (BGA).

All the code necessary to train CEREBRUM and run the survey is available at the project’s GitHub page.²

3. Material and methods

3.1. Data

To speed up research and promote reproducibility, numerous large-scale neuroimaging experiments make the collected data available to all researchers (Marcus et al., 2007; Van Essen et al., 2013; Oxtoby et al., 2019; Miller et al., 2016; Bellec et al., 2017). However, none of these studies provide manual annotations, as carrying out the operation on such large databases would prove exceptionally time-consuming.

For this reason, most of the studies investigating the application of DL architectures for brain MRI segmentation make use of automatically produced GT for training purposes (Roy et al., 2019; McClure et al., 2018; Fedorov et al., 2017; Rajchl et al., 2018) - with some of them reporting that the latter can be exploited to train models that perform as well as (Rajchl et al., 2018), or even better than (Roy et al., 2019), the automated pipeline itself. Motivated by this rationale, we train and test the proposed model using both a large collection of out-of-the-scanner MR images and the results of the FreeSurfer (Fischl, 2012) cortical reconstruction process, recon-all, as reference GT. As anticipated in Section 1, we relabel this result preserving seven among the most important classes of interest in most fMRI studies (see Section 2.3 and Fig. 1).
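The relabelling step can be sketched as a lookup-table operation over an integer label volume. The mapping below is a partial, illustrative assumption: the label IDs follow the standard FreeSurfer colour lookup table, but the grouping into seven classes and the class indices are hypothetical and are not the table used by the authors.

```python
import numpy as np

# Illustrative, partial mapping from FreeSurfer label IDs to seven classes.
# Grouping and class indices are assumptions, not the authors' relabelling.
FS_TO_CLASS = {
    3: 1, 42: 1,                               # cerebral cortex -> GM
    11: 2, 12: 2, 13: 2, 50: 2, 51: 2, 52: 2,  # caudate, putamen, pallidum -> basal ganglia
    2: 3, 41: 3,                               # cerebral white matter -> WM
    4: 4, 43: 4,                               # lateral ventricles -> ventricles
    7: 5, 8: 5, 46: 5, 47: 5,                  # cerebellum (white matter and cortex)
    16: 6,                                     # brainstem
    24: 7,                                     # CSF
}

def relabel(aseg, mapping=FS_TO_CLASS, background=0):
    """Map a non-negative integer label volume to the reduced class set."""
    lut_size = max(int(aseg.max()), max(mapping)) + 1
    lut = np.full(lut_size, background, dtype=np.uint8)
    for fs_id, cls in mapping.items():
        lut[fs_id] = cls
    return lut[aseg]  # unmapped labels fall back to background

toy = np.array([[0, 2, 41], [3, 16, 999]])
relabelled = relabel(toy)
```

Any label not listed in the mapping is sent to the background class, so a real relabelling would need a complete table covering every structure of interest.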

The database, collected from the Centre for Cognitive Neuroimaging (University of Glasgow) in more than 10 years of machine activity, consists of 947 MR images - 900 of which are used for training, 11 for validation, and 36 for testing. All the volumes are out-of-the-scanner, i.e., obtained directly from a set of DICOM images using dcm2niix (Li et al., 2016), whose auto-crop option is exploited to make sizes consistent across the whole dataset (i.e., 192 × 256 × 170 for sagittal, coronal, and longitudinal axis, respectively) without any other pre-processing of the data. Given the number of available scans for training, and since no registration is performed, the variability in shape, rotation, position, and anatomical size is such that no data augmentation is needed to avoid the risk of overfitting. The first two columns of Fig. 2(a) and (b) show detailed views from some selected slices of the out-of-the-scanner T1w and the corresponding relabelled FreeSurfer segmentation, respectively. The main characteristics of the dataset are summarised in Table 1. As the data have been collected under different ethics applications, we are not able to make the whole database publicly available. However, 7 out of 36 volumes used for testing are collected under the approval of the local ethics committee of the College of Science & Engineering (ethics #300170016) and shared online after anonymisation,³ for comparison and research purposes, along with the segmentation masks resulting from CEREBRUM and FreeSurfer (see Fig. 2 and Section 4.2).

Fig. 2. Out-of-the-scanner (contrast enhanced) T1w scan (left), FreeSurfer segmentation (middle), and the result produced by our model (right). Fig. (a) depicts slices of test Subject 1, while (b) depicts slices of test Subject 4 (sagittal, coronal, and longitudinal view, respectively). Cases of white matter over-segmentation are highlighted by yellow circles, while cases of white matter under-segmentation are highlighted by turquoise circles (best viewed in electronic format).

Table 1
Dataset details. MR Images acquired at the Centre for Cognitive Neuroimaging (University of Glasgow, UK).

Parameter                Value
Sequence used            T1w MPRAGE
Field strength           3 Tesla
Voxel size               1 mm isotropic
Original volume sizes    192 × 256 × 256
Training volume sizes    192 × 256 × 170 ᵇ
Training                 900 volumes
Validation               11 volumes
Testing                  36 volumes ᵃ

ᵃ 7 of which are publicly available.
ᵇ Out-of-the-scanner data, neck cropping only.

3.2. Proposed model

To make the complexity of managing our 192 × 256 × 170 voxel data tractable, we carefully optimise the model architecture so as to implicitly deal with GPU memory constraints. Furthermore, we exploit, for training purposes, a machine equipped with 4 GeForce® GTX 1080 Ti GPUs - distributing different parts of the model on different GPUs.

Inspired by Ronneberger et al. (2015) and Çiçek et al. (2016), we propose a deep encoder-decoder model with six 3D convolutional blocks, which are arranged in increasing number on three layers. Since a whole volume is considered as an input, the feature maps extracted by such convolutional blocks are not limited to patches but span across the entire volume. As each block captures the content of the whole brain MRI, this enables the learning of both local and global spatial features by leveraging the spatial context which is propagated to each subsequent block. The capability of CEREBRUM to learn both local and global features is coherent with the last layer units of the model having a 100 × 100 × 100 theoretical receptive field. A table reporting the complete calculation of this parameter for each convolutional block of CEREBRUM can be found in the Supplementary Material. In order to better exploit the fine details found in 3T brain MRI data, kernels of size 3 × 3 × 3 are used as feature extractors. Instead of max-pooling, convolutions with stride are used as a dimensionality reduction method, thus allowing the network to learn the optimal down-sampling strategy starting from the extracted features. Exploiting such operations, and to force the learning of more abstract (spatial) features, a factor 1:64 dimensionality reduction is implemented after the first layer. Finally, skip connections are used along with tensorial sum (instead of concatenation, Quan et al., 2016) to improve the quality of the segmented volume while greatly limiting the number of parameters to ∼5M, far fewer than state-of-the-art models which are structured in a similar fashion.

³ https://openneuro.org/datasets/ds002207/versions/1.0.0
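To make the notion of a theoretical receptive field concrete, the following sketch applies the standard recurrence for a stack of convolutions along one axis: each layer widens the field by (kernel − 1) times the current input-space step, and each stride multiplies that step. The layer list used in the example is hypothetical and is not the published CEREBRUM configuration, which is reported in the Supplementary Material.

```python
def receptive_field(layers):
    """Theoretical receptive field (along one axis) of stacked convolutions.

    layers: iterable of (kernel_size, stride) tuples, ordered input to output.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # widen by (k - 1) steps measured in input voxels
        jump *= stride             # striding enlarges the step between adjacent outputs
    return rf

# Hypothetical stack, NOT the CEREBRUM layer configuration.
example_stack = [(3, 1), (3, 1), (4, 4), (3, 1), (3, 1), (2, 2), (3, 1)]
example_rf = receptive_field(example_stack)
```

The sketch also shows why strided convolutions matter here: once a stride-4 layer is applied, every later 3-voxel kernel extends the field by 8 input voxels rather than 2. As a side note under the same assumptions, a 1:64 reduction in voxel count would correspond to a stride of 4 along each of the three axes (4³ = 64).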

We train the model by optimising the categorical cross-entropy function. Convergence is achieved after roughly 24 hours of training (40 epochs), using Adam (Kingma and Ba, 2014) with a learning rate of $42 \cdot 10^{-5}$, $\beta_1 = 0.9$, and $\beta_2 = 0.999$. Furthermore, we set the batch size to 1 and thus do not implement batch normalisation (Ioffe and Szegedy, 2015).
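As a minimal sketch of what these optimiser settings mean, the function below performs one Adam update with the learning rate and β values quoted above. The value of `eps` is an assumption (the default suggested by Kingma and Ba), as are the toy parameter and gradient values; the actual training relies on a deep learning framework rather than on hand-written updates.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=42e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new (param, m, v). t is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (uncentred variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy values, for illustration only.
w = np.array([1.0, -2.0])
g = np.array([0.5, -3.0])
w1, m1, v1 = adam_step(w, g, np.zeros(2), np.zeros(2), t=1)
```

On the first step the bias-corrected ratio reduces to roughly the sign of the gradient, so each parameter moves by about one learning rate regardless of the gradient magnitude.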

4. Results

The results we present in this section aim to confirm the hypothesis that avoiding the partitioning of MRI data enables CEREBRUM to learn global spatial features useful for improving segmentation. At first, in Section 4.1, we provide a numerical comparison with other state-of-the-art CNN architectures (U-Net, Ronneberger et al., 2015; 3D U-Net, Çiçek et al., 2016; QuickNAT, Roy et al., 2019). Then, in Section 4.2, we conduct a survey involving expert neuroscientists to subjectively assess the CEREBRUM segmentation accuracy. Finally, we further verify the validity of our assumptions by inspecting the soft-segmentation maps produced by the models in Section 4.3, and we demonstrate the suitability of our dataset by analysing the impact of the training set size on CEREBRUM performance in Section 4.4.

4.1. Numerical comparison

We numerically assess the performance of the models, using FreeSurfer segmentation as a reference, exploiting the metrics utilised in the MICCAI MRBrainS18 challenge (among the most employed in the literature, Taha and Hanbury, 2015). The Dice (similarity) Coefficient (DC) is a measure of overlap, and a common metric in segmentation tasks. The Hausdorff Distance (HD), a dissimilarity measure, is useful to gain some insight on contour segmentation. Since HD is generally sensitive to outliers, a modified version (95th percentile, HD95) is generally used when dealing with medical image segmentation evaluation (Huttenlocher et al., 1993). Finally, the Volumetric Similarity (VS), as in Cárdenes et al. (2009), evaluates the similarity between two volumes.
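Two of these metrics are simple enough to sketch directly on binary masks. The code below uses the usual definitions, DC = 2|A ∩ B| / (|A| + |B|) and VS = 1 − ||A| − |B|| / (|A| + |B|); the function names, the convention of returning 1.0 for two empty masks, and the toy masks are assumptions for illustration. HD95 is omitted because it requires surface distance computations.

```python
import numpy as np

def dice_coefficient(pred, ref):
    """Dice overlap between two binary masks of identical shape."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 2.0 * np.logical_and(pred, ref).sum() / denom if denom else 1.0

def volumetric_similarity(pred, ref):
    """1 minus the absolute volume difference divided by the summed volumes."""
    vol_pred = int(pred.astype(bool).sum())
    vol_ref = int(ref.astype(bool).sum())
    total = vol_pred + vol_ref
    return 1.0 - abs(vol_pred - vol_ref) / total if total else 1.0

# Toy one-dimensional masks, for illustration only.
pred_mask = np.array([1, 1, 1, 0, 0, 0])
ref_mask = np.array([0, 1, 1, 1, 1, 0])
dc = dice_coefficient(pred_mask, ref_mask)
vs = volumetric_similarity(pred_mask, ref_mask)
```

For a multi-class label volume, class-wise scores follow by comparing `pred_labels == c` with `ref_labels == c` for each class index `c`. Note that VS only compares volumes, so two masks of equal size that do not overlap at all would still score 1.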

CEREBRUM is compared against state-of-the-art encoder-decoder architectures: the well-known 2D-patch-based U-Net (Ronneberger et al., 2015, trained on the three principal views, i.e., longitudinal, sagittal, and coronal), the 3D-patch-based U-Net 3D (Çiçek et al., 2016 - with 3D patches sized 64 × 64 × 64, as in Çiçek et al., 2016; Fedorov et al., 2017; Pawlowski et al., 2017), and the QuickNAT architecture (Roy et al., 2019), which implements view-aggregation starting from 2D-patch-based models. We train all the models minimising the same loss for 50 epochs, using the same number of volumes, and similar learning rates (with changes in those regards made to ensure the best possible validation score). Fig. 3 shows class-wise results (DC, HD95, and VS) depicting the average score (computed across all the 36 test volumes) and the standard deviation. We compare 2D-patch-based (longitudinal, sagittal, coronal), QuickNAT, 3D-patch-based, and CEREBRUM (both a max-pooling and a strided-convolutions version). Overall, the latter outperforms all the other CNN-based solutions on every class, despite having far fewer parameters; where its average score (computed across all the subjects) is comparable with that of other methods (e.g., view-aggregation, GM), it shows a smaller variability (suggesting higher reliability). Moreover, we determine the p-values for such scores by computing a paired t-test using the strided-convolutions version of CEREBRUM as a reference. In Fig. 3, statistically significant findings (p < 0.05) are highlighted with asterisks, whereas the numerical results are reported in the Supplementary Materials.
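The paired t-test used for these comparisons operates on per-subject score differences between two models evaluated on the same test volumes. The sketch below computes only the t statistic and the degrees of freedom with the Python standard library; the scores are made-up numbers, not results from the paper. Obtaining the p-value additionally requires the Student's t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_rel`) provides.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """t statistic and degrees of freedom for a paired t-test on matched samples."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Made-up per-subject Dice scores, for illustration only.
reference_scores = [0.91, 0.93, 0.90, 0.92, 0.94]
baseline_scores = [0.88, 0.91, 0.89, 0.90, 0.90]
t_stat, dof = paired_t_statistic(reference_scores, baseline_scores)
```

Pairing matters because the same subjects are segmented by both models: the test removes between-subject variability and looks only at the consistency of the per-subject difference.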

4.2. Experts’ qualitative evaluation

The quantitative assessment presented in Section 4.1, though informative, cannot be considered exhaustive. Indeed, using FreeSurfer as a reference for such evaluation makes the latter a ranking on a relative scale - and while this highlights the value of the fully-volumetric approach, it does not make a direct comparison with the atlas-based method possible. Thus, we need to confirm more systematically what can be inferred, for instance, from Fig. 2 - where the far superior qualitative performance of CEREBRUM compared to FreeSurfer is clear, as the former produces more accurate segmentation masks, with far fewer holes and bridges. This somewhat surprising generalisation capability of CEREBRUM over its training reference, if confirmed, would prove the desired “strengthening” effect yielded by the adoption of a weakly supervised learning approach. Moreover, quantitative assessments are often criticised by human experts, such as physicians and neuroscientists, because they do not take into account the severity of each segmentation error (Taha and Hanbury, 2015), which is of critical importance in professional usage scenarios.

For the aforementioned reasons, we design and implement a systematic subjective assessment by means of a PsychoPy (Peirce, 2007) test in which five expert neuroscientists (with more than five years of expertise in MRI analysis) are asked to choose the most accurate segmentation between the one produced by CEREBRUM and the (relabelled) FreeSurfer one. The participants are presented with a coronal, sagittal, or axial slice selected from a test volume, and are allowed both to navigate between four neighbouring slices (two following and two preceding the displayed one) and to change the opacity of the segmentation mask (from 0% to 100%) to better evaluate the latter with respect to the anatomical data. This process is repeated seven times - one for each test subject - for each of the eight brain areas of interest, i.e., the early visual cortex (EVC), the high-level visual areas (HVC), the motor cortex (MCX), the cerebellum (CER), the hippocampus (HIP), the early auditory cortex (EAC), the brainstem (BST), and the basal ganglia (BGA). The choice of the slices to present and the order in which the latter are arranged is randomised. Furthermore, the neuroscientists are allowed to skip as many slices as they want if they are unsure about the choice: such cases are reported separately. The survey interface and a run example are provided in the Supplementary Material. From the results shown in Fig. 4 it emerges that, according to expert neuroscientists, CEREBRUM qualitatively outperforms FreeSurfer. This proves the model’s superior generalisation capability and provides evidence to support the adopted weakly supervised approach. Moreover, such results hint at the possibility of having atlas-based methods and deep learning ones operating together in a synergistic way.
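The trial structure described above (seven test subjects times eight brain areas, presented in random order) can be sketched with the standard library. The subject numbering, the dictionary layout, and the function name `build_trials` are assumptions; the sketch does not reproduce the PsychoPy survey itself or its slice selection logic.

```python
import random

AREAS = ["EVC", "HVC", "MCX", "CER", "HIP", "EAC", "BST", "BGA"]
SUBJECTS = list(range(1, 8))  # seven test subjects; the numbering is an assumption

def build_trials(seed=None):
    """Return one trial per (subject, area) pair in randomised order."""
    rng = random.Random(seed)
    trials = [{"subject": s, "area": a} for s in SUBJECTS for a in AREAS]
    rng.shuffle(trials)
    return trials

trials = build_trials(seed=0)
```

Under these assumptions each expert sees 56 trials, so five experts express at most 280 preferences in total, fewer whenever a trial is skipped.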

4.3. Probability maps and entropy measures

To further investigate the hypothesis that a fully-volumetric approach is advantageous with respect to patch-based models, we also conduct an assessment on the predicted probability maps (i.e., soft segmentation). Such an evaluation can reveal the ability of the model to make use of spatial cues: for instance, a well-trained model which exploits learned spatial features should predict the presence of cerebellum voxels only in the back of the brain, where the structure is normally located.

Fig. 5(a) and (b) show two selected slices of the soft segmentation (percent probability, displayed in logarithmic scale) resulting from the best 2D-patch-based method (i.e., QuickNAT), the 3D-patch-based method, and CEREBRUM - for the cerebellum and basal ganglia classes, respectively (superimposed on the corresponding T1w slice). Other classes are omitted for clarity.

The probability maps produced by the 2D and 3D-patch-based methods are characterised by the presence of voxels associated with a significant probability of belonging to the structure of interest (p > 0.2) despite their distance from the latter. This can lead to misclassification errors in the hard segmentation (after the thresholding). In particular, higher uncertainty and spurious activations due to views averaging can be seen in the soft segmentation maps produced by QuickNAT - while blocking artefacts on the patch borders are visible in the case of the 3D U-Net, even when the latter is trained using overlapping 3D patches whose predictions are then averaged. The soft segmentation produced by CEREBRUM, on the contrary, is more coherent and closer to the reference in both cases, and does not present the aforementioned errors.

Fig. 3. Dice Coefficient, 95th percentile Hausdorff Distance, and Volumetric Similarity computed using FreeSurfer relabelled segmentation as a reference. The 2D-patch-based (red, green, blue, and grey for longitudinal, sagittal, coronal, and view-aggregation, respectively), the 3D-patch-based (pink), and our model (yellow for max-pooling and orange for strided convolutions) are compared. The height of the bar indicates the mean across all the test subjects, while the error bar represents the standard deviation. The asterisks below the bars highlight statistically significant results (p < 0.05), where the p-value is obtained from a paired t-test computed with respect to the strided-convolutions version of CEREBRUM, labelled with “ref.” (best viewed in electronic format).

Fig. 4. Outcome of the segmentation accuracy assessment test, conducted by expert neuroscientists, for the following areas: early visual cortex (EVC), the high-level visual areas (HVC), the motor cortex (MCX), the cerebellum (CER), the hippocampus (HIP), the early auditory cortex (EAC), the brainstem (BST), and the basal ganglia (BGA). The bars represent the number of preferences expressed by the experts: CEREBRUM (in orange), FreeSurfer (in blue), or none of the two (in grey). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Besides such qualitative evaluations, we quantitatively compare the sparseness of the predicted probability maps exploiting the average voxel-wise entropy $H_V$, defined as:

$$H_V(V, \mathrm{DNN}) = \frac{\sum_{v \in V} H_v(v, \mathrm{DNN})}{|V|} \tag{1}$$

where $V$ is an MRI volume, DNN a trained DL model, and $v$ a single voxel. Hence, $|V|$ is the total number of voxels in the volume, and the summation in Eq. (1) is computed for every voxel $v$ in $V$. The quantity $H_v$ is the voxel-wise entropy, defined starting from the classical definition in Shannon (1948):

$$H_v(v, \mathrm{DNN}) = -\sum_{c \in C} P(\mathrm{DNN}(v) \in c) \ln P(\mathrm{DNN}(v) \in c) \tag{2}$$

where $C$ is the set of the segmented classes (in our case $C = \{\mathrm{GM}, \mathrm{GANGL}, \ldots, \mathrm{BRNSTEM}\}$), $P(\mathrm{DNN}(v) \in c)$ is the probability the model assigns to the event “voxel $v$ belongs to the class $c$”, and $\ln P(\mathrm{DNN}(v) \in c)$ is the natural logarithm of such quantity. We report the results of such test in Fig. 6(a), “normalising” the quantity $H_V$ by the highest average voxel-wise entropy achievable $H_V^{MAX}$, i.e., $H_V$ computed when every class is predicted as equiprobable for every voxel - so that $H_V / H_V^{MAX} \in [0, 1]$ for ease of interpretation.
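Eqs. (1) and (2) and the normalisation step can be sketched in NumPy as follows. The array layout `(..., n_classes)`, the function name, and the clipping constant `eps` (used only to avoid evaluating ln 0) are assumptions; the sketch relies on the fact that the entropy of an equiprobable prediction over |C| classes is ln |C|.

```python
import numpy as np

def normalised_mean_entropy(probs, eps=1e-12):
    """Average voxel-wise entropy divided by its maximum, ln(n_classes).

    probs: array shaped (..., n_classes) with per-voxel class probabilities
    summing to 1 along the last axis (layout assumed for illustration).
    """
    n_classes = probs.shape[-1]
    safe = np.clip(probs, eps, 1.0)              # avoid log(0); 0 * log(eps) stays 0
    h_v = -np.sum(probs * np.log(safe), axis=-1)  # Eq. (2), one value per voxel
    h_V = h_v.mean()                              # Eq. (1), average over all voxels
    return h_V / np.log(n_classes)                # normalise by H_V^MAX = ln|C|

# Toy volumes with 7 classes: fully uncertain versus fully confident predictions.
uniform = np.full((2, 2, 2, 7), 1.0 / 7.0)
confident = np.zeros((2, 2, 2, 7))
confident[..., 0] = 1.0
h_uniform = normalised_mean_entropy(uniform)
h_confident = normalised_mean_entropy(confident)
```

The two toy volumes mark the ends of the scale: an equiprobable prediction gives a normalised entropy of 1, while a one-hot prediction gives 0.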

If entropy evaluates the sparseness of the predicted probability maps in general, cross-entropy is able to assess the uncertainty of a model, once the correct predictions are known. The average voxel-wise cross-entropy $CH_V$ builds upon the idea of the voxel-wise cross-entropy $CH_v$, defined as:

$$CH_v(v, \mathrm{GT}, \mathrm{DNN}) = -\sum_{c \in C} P(\mathrm{GT}(v) \in c) \ln P(\mathrm{DNN}(v) \in c) \tag{3}$$

where GT is the ground-truth reference provided by FreeSurfer, $C$ is the set of the segmented classes, $P(\mathrm{GT}(v) \in c)$ is the probability related to the ground-truth event “voxel $v$ belongs to the class $c$” (i.e., “1” for the correct class, and “0” otherwise, since FreeSurfer does not provide class probabilities), and $\ln P(\mathrm{DNN}(v) \in c)$ is the natural logarithm of the probability the DNN model assigns to the event “voxel $v$ belongs to the class $c$”. We report the results of such test in Fig. 6(b), “normalising” the quantity $CH_V$ by the highest average voxel-wise cross-entropy achievable $CH_V^{MAX}$, i.e., the cross-entropy computed by using a random classifier - so that $CH_V / CH_V^{MAX} \in [0, 1]$.
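With a one-hot ground truth, Eq. (3) reduces to minus the logarithm of the probability assigned to the correct class. The sketch below implements this and normalises by ln |C|, under the assumption that the "random classifier" is one that predicts every class as equiprobable; that interpretation, the array layout, and the function name are assumptions rather than details confirmed by the text.

```python
import numpy as np

def normalised_mean_cross_entropy(probs, gt_labels, eps=1e-12):
    """Average voxel-wise cross-entropy against hard labels, divided by ln(n_classes).

    probs: array (..., n_classes) of predicted class probabilities.
    gt_labels: integer array (...) with the reference class index per voxel.
    """
    n_classes = probs.shape[-1]
    # Probability the model assigns to the reference class of each voxel.
    p_true = np.take_along_axis(probs, gt_labels[..., None], axis=-1)[..., 0]
    ch_v = -np.log(np.clip(p_true, eps, 1.0))  # Eq. (3) with a one-hot ground truth
    return ch_v.mean() / np.log(n_classes)     # assumed CH_V^MAX: uniform prediction

# Toy data with 7 classes and all reference labels equal to class 0.
labels = np.zeros((2, 2, 2), dtype=np.int64)
uniform = np.full((2, 2, 2, 7), 1.0 / 7.0)
perfect = np.zeros((2, 2, 2, 7))
perfect[..., 0] = 1.0
ch_uniform = normalised_mean_cross_entropy(uniform, labels)
ch_perfect = normalised_mean_cross_entropy(perfect, labels)
```

Under this assumption, a model that does at least as well as uniform guessing yields a value between 0 and 1, with 0 reached only when the correct class always receives probability 1.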

Both the qualitative examples illustrated in Fig. 5 and the quantitative evaluations presented in Fig. 6 hint at the superior ability of the proposed model in learning both global and local spatial features. Additional qualitative examples of probability maps, as well as the tables reporting the p-values for the tests depicted in Fig. 6, are provided in the Supplementary Material.

4.4. Number of training samples

One of the possible limitations of approaching the brain MRI segmentation task in a fully-volumetric fashion could be the scarcity of training data - for in such a case each volume does not yield many training samples, as for 2D and 3D-patch-based solutions, but a single one. To investigate this possible drawback, we evaluate the performance of CEREBRUM when trained on smaller sub-sets of our database. In particular, we train the proposed model by randomly extracting 25, 50, 100, 250, 500, 700, and 900 samples from the training set. To evaluate the performance of the model in the first two cases (i.e., 25 and 50 MRI scans), we repeat the training 5 times (on randomly extracted yet non-overlapping subsets of the database) and average the results. Furthermore, we evaluate the impact on the performance yielded by the introduction of strided convolutions (i.e., more learnable parameters) when the training set size is limited, by training a variation of CEREBRUM where max-pooling is used as a dimensionality-reduction strategy. Fig. 7 shows that the performance deteriorates significantly as the training set size falls below 250 samples, while substantial stability is reached over 750 samples. This confirms that our 900-sample training set is properly sized for the task, without there being any need for data augmentation.
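The repeated small-subset experiment (five randomly extracted, non-overlapping subsets of 25 or 50 scans) can be sketched with the standard library. The function name, the use of integer identifiers for the scans, and the seed are assumptions made for illustration.

```python
import random

def disjoint_subsets(ids, subset_size, n_subsets, seed=None):
    """Draw n_subsets random, mutually non-overlapping subsets of the given size."""
    if subset_size * n_subsets > len(ids):
        raise ValueError("not enough samples for disjoint subsets")
    rng = random.Random(seed)
    shuffled = rng.sample(ids, len(ids))  # shuffled copy; the input list is untouched
    return [shuffled[i * subset_size:(i + 1) * subset_size] for i in range(n_subsets)]

train_ids = list(range(900))  # hypothetical identifiers for the 900 training scans
subsets_25 = disjoint_subsets(train_ids, 25, 5, seed=0)
subsets_50 = disjoint_subsets(train_ids, 50, 5, seed=0)
```

Shuffling once and slicing guarantees that no scan appears in two subsets, so the five training runs are independent draws from the database and their averaged score is less sensitive to a single lucky or unlucky sample.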

Fig. 5. Soft segmentation maps of test subject 1's cerebellum (a) and basal ganglia (b) produced by the best 2D-patch-based model (QuickNAT), the 3D-patch-based model (3D U-Net), and CEREBRUM (ours). The proposed approach produces results that are spatially more coherent and lack false positives (highlighted in light blue). A base-10 logarithmic scale of percent probability is used for visualisation purposes (best viewed in electronic format).


Fig. 6. Normalised average voxel-wise entropy $H_V / H_V^{MAX}$ (a), and normalised average voxel-wise cross-entropy $CH_V / CH_V^{MAX}$ (b). The probability maps resulting from the 2D-patch-based (red, green, blue, and grey for longitudinal, sagittal, coronal, and view-aggregation, respectively), the 3D-patch-based (pink), and our models (yellow for max-pooling and orange for strided convolutions) are compared. The height of the bar indicates the mean across all the test subjects, while the error bar represents the standard deviation. The asterisks below the bars highlight statistically significant results (p < 0.05), where the p-value is obtained from a paired t-test computed with respect to the strided-convolutions version of CEREBRUM, labelled with “ref.” (best viewed in electronic format).

Fig. 7. Impact of the training set size on the performance: Dice Coefficient averaged across all the seven classes. Results are computed on the whole test set (36 volumes).
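For reference, the class-averaged Dice Coefficient reported in Fig. 7 can be computed along the following lines. This is a minimal sketch using the standard definition 2|A ∩ B| / (|A| + |B|); the function name, the integer label encoding, and the choice to skip classes absent from both volumes are assumptions of this example, not details confirmed by the paper.

```python
import numpy as np

def mean_dice(pred, ref, n_classes):
    """Dice coefficient per class, averaged over classes present in pred or ref."""
    scores = []
    for c in range(n_classes):
        p, r = (pred == c), (ref == c)
        denom = p.sum() + r.sum()
        if denom == 0:
            continue  # class absent from both volumes: skipped (a design choice)
        scores.append(2.0 * np.logical_and(p, r).sum() / denom)
    return float(np.mean(scores))

# Toy label maps; real inputs would be 3D volumes of class indices.
pred = np.array([0, 0, 1, 1])
ref = np.array([0, 1, 1, 1])
print(mean_dice(pred, ref, n_classes=2))
```

Averaging this score over all test volumes for each training-set size would give one point per size on a curve like the one in Fig. 7.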

5. Conclusion

In this work we presented CEREBRUM, a CNN-based deep model that approaches the brain MRI segmentation problem in a fully-volumetric fashion. The proposed architecture is a carefully (architecturally) optimised encoder-decoder that, starting from a T1w MRI volume, produces a result in only a few seconds on a desktop GPU. We evaluated the performance of the proposed model, comparing it to state-of-the-art 2D and 3D-patch-based models with similar structure, using the Dice Coefficient, the 95th percentile Hausdorff Distance, and the Volumetric Similarity, and found that CEREBRUM performs better. Furthermore, we conducted a survey of expert neuroscientists to obtain their judgements about the accuracy of the resulting segmentation, comparing the latter with the result of the FreeSurfer cortical reconstruction process. According to the participants in this experiment, CEREBRUM achieves better segmentation than FreeSurfer. To our knowledge, this is the first time a DL-based fully-volumetric approach for brain MRI segmentation is deployed. The results we obtained demonstrate the potential of this approach, as CEREBRUM outperforms 2D and 3D-patch-based encoder-decoder models using far fewer parameters. Removing the partitioning of the volume, as hypothesised, allows the model to learn both local and global spatial features. Furthermore, we are also the first to conduct a qualitative assessment test consulting expert neuroscientists: this is fundamental, as commonly used metrics often fail to capture the information experts need in order to rely on DL methods and exploit them for research.

Declaration of Competing Interest

None.

CRediT authorship contribution statement

Dennis Bontempi: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization. Sergio Benini: Conceptualization, Writing - original draft, Writing - review & editing, Supervision, Project administration. Alberto Signoroni: Writing - review & editing, Supervision, Funding acquisition. Michele Svanera: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Data curation, Supervision, Project administration. Lars Muckli: Resources, Funding acquisition.

Acknowledgments

This project has received funding from the European Union's Horizon 2020 Programme for Research and Innovation under the Specific Grant Agreement No. 785907 (Human Brain Project SGA2) awarded to LM.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.media.2020.101688.

References

Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D.L., Erickson, B.J., 2017. Deep learning for brain MRI segmentation: state of the art and future directions. J. Digit. Imaging 30 (4), 449–459. doi:10.1007/s10278-017-9983-4.


Andermatt, S., Pezold, S., Cattin, P., 2016. Multi-dimensional gated recurrent units for the segmentation of biomedical 3D-data. In: Carneiro, G., Mateus, D., Peter, L., Bradley, A., Tavares, J.M.R.S., Belagiannis, V., Papa, J.P., Nascimento, J.C., Loog, M., Lu, Z., Cardoso, J.S., Cornebise, J. (Eds.), Deep Learning and Data Labeling for Medical Applications. Springer International Publishing, Cham, pp. 142–151. doi:10.1007/978-3-319-46976-8_15.

Bellec, P., Chu, C., Chouinard-Decorte, F., Benhajali, Y., Margulies, D.S., Craddock, R.C., 2017. The neuro bureau ADHD-200 preprocessed repository. Neuroimage 144, 275–286. doi:10.1016/j.neuroimage.2016.06.034.

Cabezas, M., Oliver, A., Lladó, X., Freixenet, J., Bach Cuadra, M., 2011. A review of atlas-based segmentation for magnetic resonance brain images. Comput. Methods Programs Biomed. 104 (3), e158–e177. doi:10.1016/j.cmpb.2011.07.015.

Chen, H., Dou, Q., Yu, L., Qin, J., Heng, P.-A., 2018. Voxresnet: deep voxelwise residual networks for brain segmentation from 3D MR images. Neuroimage 170, 446–455. doi:10.1016/j.neuroimage.2017.04.041.

Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O., 2016. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016. Springer International Publishing, Cham, pp. 424–432.

Cárdenes, R., de Luis-García, R., Bach-Cuadra, M., 2009. A multidimensional segmentation evaluation for medical image data. Comput. Methods Programs Biomed. 96 (2), 108–124. doi:10.1016/j.cmpb.2009.04.009.

Deniz, C.M., Xiang, S., Hallyburton, R.S., Welbeck, A., Babb, J.S., Honig, S., Cho, K., Chang, G., 2018. Segmentation of the proximal femur from MR images using deep convolutional neural networks. Sci. Rep. 8 (1), 16485. doi:10.1038/s41598-018-34817-6.

Despotović, I., Goossens, B., Philips, W., 2015. MRI segmentation of the human brain: challenges, methods, and applications. Comput. Math. Methods Med. 2015, 450341 (1–23). doi:10.1155/2015/450341.

Dolz, J., Gopinath, K., Yuan, J., Lombaert, H., Desrosiers, C., Ben Ayed, I., 2019. HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation. IEEE Trans. Med. Imaging 38 (5), 1116–1126. doi:10.1109/TMI.2018.2878669.

Ellingsen, L.M., Roy, S., Carass, A., Blitz, A.M., Pham, D.L., Prince, J.L., 2016. Segmentation and labeling of the ventricular system in normal pressure hydrocephalus using patch-based tissue classification and multi-atlas labeling. In: Styner, M.A., Angelini, E.D. (Eds.), Medical Imaging 2016: Image Processing. SPIE, pp. 116–122. doi:10.1117/12.2216511. International Society for Optics and Photonics.

Esteban, O., Markiewicz, C.J., Blair, R.W., Moodie, C.A., Isik, A.I., Erramuzpe, A., Kent, J.D., Goncalves, M., DuPre, E., Snyder, M., Oya, H., Ghosh, S.S., Wright, J., Durnez, J., Poldrack, R.A., Gorgolewski, K.J., 2019. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods 16 (1), 111–116. doi:10.1038/s41592-018-0235-4.

Fedorov, A., Johnson, J., Damaraju, E., Ozerin, A., Calhoun, V., Plis, S., 2017. End-to-end learning of brain tissue segmentation from imperfect labeling. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3785–3792. doi:10.1109/IJCNN.2017.7966333.

Fischl, B., 2012. FreeSurfer. Neuroimage 62 (2), 774–781. doi:10.1016/j.neuroimage.2012.01.021.

FreeSurfer, 2008. Recon-all run times. https://surfer.nmr.mgh.harvard.edu/fswiki/ReconAllRunTimes. [Online; accessed 11-September-2019].

Hamidinekoo, A., Denton, E., Rampun, A., Honnor, K., Zwiggelaar, R., 2018. Deep learning in mammography and breast histology, an overview and future trends. Med. Image Anal. 47, 45–67. doi:10.1016/j.media.2018.03.006.

Huttenlocher, D.P., Klanderman, G.A., Rucklidge, W.J., 1993. Comparing images using the Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 15 (9), 850–863. doi:10.1109/34.232073.

Ioffe, S., Szegedy, C., 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167.

Kingma, D.P., Ba, J., 2014. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Klein, A., Ghosh, S.S., Stavsky, E., Lee, N., Rossa, B., Reuter, M., Neto, E.C., Keshavan, A., 2017. Mindboggling morphometry of human brains. PLoS Comput. Biol. 13 (2), e1005350. doi:10.1371/journal.pcbi.1005350.

Lerch, J.P., van der Kouwe, A.J.W., Raznahan, A., Paus, T., Johansen-Berg, H., Miller, K.L., Smith, S.M., Fischl, B., Sotiropoulos, S.N., 2017. Studying neuroanatomy using MRI. Nat. Neurosci. 20 (3), 314–326. doi:10.1038/nn.4501.

Li, W., Wang, G., Fidon, L., Ourselin, S., Cardoso, M.J., Vercauteren, T., 2017. On the compactness, efficiency, and representation of 3D convolutional networks: Brain parcellation as a pretext task. In: Niethammer, M., Styner, M., Aylward, S., Zhu, H., Oguz, I., Yap, P.-T., Shen, D. (Eds.), Information Processing in Medical Imaging. Springer International Publishing, Cham, pp. 348–360.

Li, X., Morgan, P.S., Ashburner, J., Smith, J., Rorden, C., 2016. The first step for neuroimaging data analysis: DICOM to NIfTI conversion. J. Neurosci. Methods 264, 47–56. doi:10.1016/j.jneumeth.2016.03.001.

Li, Y., Guo, L., Zhou, Z., 2019. Towards safe weakly supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 1. Early Access.

Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A., van Ginneken, B., Sánchez, C.I., 2017. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88. doi:10.1016/j.media.2017.07.005.

Marcus, D.S., Wang, T.H., Parker, J., Csernansky, J.G., Morris, J.C., Buckner, R.L., 2007. Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. 19 (9), 1498–1507. doi:10.1162/jocn.2007.19.9.1498.

McClure, P., Rho, N., Lee, J.A., Kaczmarzyk, J.R., Zheng, C., Ghosh, S.S., Nielson, D., Thomas, A., Bandettini, P., Pereira, F., 2018. Knowing What You Know in Brain Segmentation Using Deep Neural Networks. arXiv:1812.01719 [cs, stat].

Mendrik, A.M., Vincken, K.L., Kuijf, H.J., Breeuwer, M., Bouvy, W.H., de Bresser, J., Alansary, A., de Bruijne, M., Carass, A., El-Baz, A., Jog, A., Katyal, R., Khan, A.R., van der Lijn, F., Mahmood, Q., Mukherjee, R., van Opbroek, A., Paneri, S., Pereira, S., Persson, M., Rajchl, M., Sarikaya, D., Smedby, O., Silva, C.A., Vrooman, H.A., Vyas, S., Wang, C., Zhao, L., Biessels, G.J., Viergever, M.A., 2015. MRBrainS challenge: online evaluation framework for brain image segmentation in 3T MRI scans. Comput. Intell. Neurosci. 2015, 1–16. doi:10.1155/2015/813696.

Miller, K.L., Alfaro-Almagro, F., Bangerter, N.K., Thomas, D.L., Yacoub, E., Xu, J., Bartsch, A.J., Jbabdi, S., Sotiropoulos, S.N., Andersson, J.L.R., Griffanti, L., Douaud, G., Okell, T.W., Weale, P., Dragonu, I., Garratt, S., Hudson, S., Collins, R., Jenkinson, M., Matthews, P.M., Smith, S.M., 2016. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat. Neurosci. 19, 1523. doi:10.1038/nn.4393.

Milletari, F., Navab, N., Ahmadi, S., 2016. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. doi:10.1109/3DV.2016.79.

Oxtoby, N.P., Ferreira, F.S., Mihalik, A., Wu, T., Brudfors, M., Lin, H., Rau, A., Blumberg, S.B., Robu, M., Zor, C., et al., 2019. ABCD Neurocognitive Prediction Challenge 2019: Predicting Individual Residual Fluid Intelligence Scores from Cortical Grey Matter Morphology. arXiv:1905.10834.

Patenaude, B., Smith, S.M., Kennedy, D.N., Jenkinson, M., 2011. A Bayesian model of shape and appearance for subcortical brain segmentation. Neuroimage 56 (3), 907–922. doi:10.1016/j.neuroimage.2011.02.046.

Pawlowski, N., Ktena, S.I., Lee, M.C.H., Kainz, B., Rueckert, D., Glocker, B., Rajchl, M., 2017. DLTK: state of the art reference implementations for deep learning on medical images. arXiv:1711.06853.

Peirce, J.W., 2007. PsychoPy: psychophysics software in Python. J. Neurosci. Methods 162 (1–2), 8–13.

Quan, T.M., Hildebrand, D.G., Jeong, W.-K., 2016. FusionNet: a deep fully residual convolutional neural network for image segmentation in connectomics. arXiv:1612.05360.

Rajchl, M., Pawlowski, N., Rueckert, D., Matthews, P.M., Glocker, B., 2018. NeuroNet: Fast and Robust Reproduction of Multiple Brain Image Segmentation Pipelines. arXiv:1806.04224.

Ronneberger, O., Fischer, P., Brox, T., 2015. U-Net: Convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Springer International Publishing, Cham, pp. 234–241.

Roy, A.G., Conjeti, S., Navab, N., Wachinger, C., ADNI, 2019. QuickNAT: a fully convolutional network for quick and accurate segmentation of neuroanatomy. Neuroimage 186, 713–727.

Savioli, N., Montana, G., Lamata, P., 2019. V-FCNN: Volumetric fully convolution neural network for automatic atrial segmentation. In: Pop, M., Sermesant, M., Zhao, J., Li, S., McLeod, K., Young, A., Rhode, K., Mansi, T. (Eds.), Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. Springer International Publishing, Cham, pp. 273–281.

Shannon, C.E., 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27 (3), 379–423. doi:10.1002/j.1538-7305.1948.tb01338.x.

Taha, A.A., Hanbury, A., 2015. Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Med. Imaging 15 (1), 29.

Van Essen, D.C., Smith, S.M., Barch, D.M., Behrens, T.E., Yacoub, E., Ugurbil, K., 2013. The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79. doi:10.1016/j.neuroimage.2013.05.041.

Voulodimos, A., Doulamis, N., Doulamis, A., Protopapadakis, E., 2018. Deep learning for computer vision: a brief review. Comput. Intell. Neurosci. 2018, 1–13. doi:10.1155/2018/7068349.

Wachinger, C., Reuter, M., Klein, T., 2018. DeepNAT: Deep Convolutional Neural Network for segmenting neuroanatomy. Neuroimage 170, 434–445. doi:10.1016/j.neuroimage.2017.02.035.

Weier, K., Beck, A., Magon, S., Amann, M., Naegelin, Y., Penner, I.K., Thürling, M., Aurich, V., Derfuss, T., Radue, E.-W., Stippich, C., Kappos, L., Timmann, D., Sprenger, T., 2012. Evaluation of a new approach for semi-automatic segmentation of the cerebellum in patients with multiple sclerosis. J. Neurol. 259 (12), 2673–2680. doi:10.1007/s00415-012-6569-4.

Wenger, E., Mårtensson, J., Noack, H., Bodammer, N.C., Kühn, S., Schaefer, S., Heinze, H.-J., Düzel, E., Bäckman, L., Lindenberger, U., Lövdén, M., 2014. Comparing manual and automatic segmentation of hippocampal volumes: reliability and validity issues in younger and older brains: comparing manual and automatic segmentation of hc volumes. Hum. Brain Mapp. 35 (8), 4236–4248. doi:10.1002/hbm.22473.

Zhan, M., Goebel, R., de Gelder, B., 2018. Ventral and dorsal pathways relate differently to visual awareness of body postures under continuous flash suppression. eNeuro 5 (1). doi:10.1523/ENEURO.0285-17.2017.

Zhang, Y., Brady, M., Smith, S., 2001. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imaging 20 (1), 45–57. doi:10.1109/42.906424.

Zhou, Z.-H., 2017. A brief introduction to weakly supervised learning. Natl. Sci. Rev.
