• No results found

Crowd science: The organization of scientific research in open collaborative projects

N/A
N/A
Protected

Academic year: 2021

Share "Crowd science: The organization of scientific research in open collaborative projects"

Copied!
20
0
0

Loading.... (view fulltext now)

Full text

(1)

ContentslistsavailableatScienceDirect

Research

Policy

j o ur na l h o me p a g e :w w w . e l s e v i e r . c o m / l o c a t e / r e s p o l

Crowd

science:

The

organization

of

scientific

research

in

open

collaborative

projects

Chiara

Franzoni

a,∗

,

Henry

Sauermann

b

aDepartmentofManagement,EconomicsandIndustrialEngineering(DIG),PolitecnicodiMilano,P.LeonardodaVinci32,Milan20133,Italy bGeorgiaInstituteofTechnology,SchellerCollegeofBusiness,800W.PeachtreeSt.,Atlanta,GA30308,USA

a

r

t

i

c

l

e

i

n

f

o

Articlehistory:

Received15November2012 Receivedinrevisedform12July2013 Accepted13July2013

Available online 14 August 2013 Keywords: Crowdscience Citizenscience Crowdsourcing Community-basedproduction Problemsolving Openinnovation Funding

a

b

s

t

r

a

c

t

Agrowingamountofscientificresearchisdoneinanopencollaborativefashion,inprojectssometimes referredtoas“crowdscience”,“citizenscience”,or“networkedscience”.Thispaperseekstogainamore systematicunderstandingofcrowdscienceandtoprovidescholarswithaconceptualframeworkand anagendaforfutureresearch.First,webrieflypresentthreecaseexamplesthatspandifferentfields ofscienceandillustratetheheterogeneityconcerningwhatcrowdscienceprojectsdoandhowthey areorganized.Second,weidentifytwofundamentalelementsthatcharacterizecrowdscienceprojects –openparticipationandopensharingofintermediateinputs–anddistinguishcrowdsciencefrom otherknowledgeproductionregimessuchasinnovationcontestsortraditional“Mertonian”science. Third,weexplorepotentialknowledge-relatedandmotivationalbenefitsthatcrowdscienceoffersover alternativeorganizationalmodes,andpotentialchallengesitislikelytoface.Drawingonpriorresearch ontheorganizationofproblemsolving,wealsoconsiderforwhatkindsoftasksparticularbenefitsor challengesarelikelytobemostpronounced.Weconcludebyoutlininganagendaforfutureresearchand bydiscussingimplicationsforfundingagenciesandpolicymakers.

© 2013 The Authors. Published by Elsevier B.V.

1. Introduction

Forthelastcentury,scientificactivityhasbeenfirmlyplaced inuniversitiesorotheracademicorganizations,government lab-oratories,or in theR&D departments of firms.Sociologists and economists,inturn,havemadegreatprogressunderstandingthe functioningofthisestablishedsystemofscience(Merton,1973; Zuckerman,1988;DasguptaandDavid,1994;Stephan,2012).The lastfewyears,however,havewitnessedtheemergenceofprojects thatdonotfitthemoldoftraditionalscienceandthatappearto followdistinctorganizingprinciples.Foldit,forexample,isa large-scalecollaborativeprojectinvolvingthousandsofparticipantswho advanceourunderstandingofproteinfoldingatanunprecedented speed, usinga computergame astheirplatform. GalaxyZoo is a projectinvolvingover250,000 volunteerswho helpwiththe collectionofastronomicaldata,andwhohavecontributedtothe discoveryofnewclassesofgalaxiesandadeeperunderstanding

∗ Correspondingauthor.Tel.:+39223994823;fax:+39223992730. E-mailaddresses:[email protected](C.Franzoni),

[email protected](H.Sauermann).

oftheuniverse.Finally,PolymathinvolvesacolorfulmixofFields Medalistsandnon-professionalmathematicianswhocollectively solveproblemsthathavelongeludedthetraditionalapproachesof mathematicalscience.

Whileacommontermfortheseprojectshasyettobefound, theyarevariouslyreferredtoas“crowdscience”,“citizenscience”, “networkedscience”,or“massively-collaborativescience”(Young, 2010;Nielsen,2011;WigginsandCrowston,2011).Eventhough thereissignificantheterogeneityacrossprojects,theyarelargely characterizedbytwoimportantfeatures:participationinaproject isopentoawidebaseofpotentialcontributors,andintermediate inputssuchasdataorproblemsolvingalgorithmsaremadeopenly available.Whatwewillcall“crowdscience”isattractinggrowing attentionfromthescientificcommunity,butalsopolicymakers, fundingagenciesandmanagerswhoseektoevaluateitspotential benefitsandchallenges.1Basedontheexperiencesofearlycrowd scienceprojects,theopportunitiesareconsiderable.Amongothers, crowdscienceprojectsareabletodrawontheeffortand knowl-edgeinputsprovidedbyalargeanddiversebaseofcontributors,

1Forexample,scientificjournalshavepublishedspecialissuesoncitizenscience,

thetopichasbeendiscussedinmanagerialoutletssuchastheSloanManagement Review(Brokaw,2011),nationalfundingagenciesintheUSandothercountries activelyfundcrowdscienceprojects,andtheLibraryofCongressisdiscussinghow crowdscienceartifactssuchasblogsanddatasetsshouldbepreservedandcurated. 0048-7333© 2013 The Authors. Published by Elsevier B.V.

http://dx.doi.org/10.1016/j.respol.2013.07.005

Open access under CC BY-NC-ND license.

(2)

potentiallyexpandingtherangeofscientificproblemsthatcanbe addressedatrelativelylowcost,whilealsoincreasingthespeed atwhichtheycanbesolved.Indeed,crowdscienceprojectshave resultedinanumberofhigh-profilepublicationsinscientific out-letssuchasNature,ProceedingsoftheNationalAcademyofSciences (PNAS),andNatureBiotechnology.Atthesametime,crowdscience projectsfaceimportantchallengesinareassuchasattracting con-tributorsorcoordinatingthecontributionsofalargenumberof participants.Adeeperunderstandingofthesevariousbenefitsand challengesmayallowustoassessthegeneralprospectsofcrowd science,butalsotoconjectureforwhichkindsofscientificproblems crowdsciencemaybemore–orless–suitablethanalternative modesofknowledgeproduction.

Despite thegrowingnumber of crowd science projects in a widerange offields (seeTable1 for examples),scholarlywork oncrowdscienceitselfislargelyabsent.Weaddressthislackof researchinseveralways.First,weintroducethereadertocrowd sciencebybrieflypresentingthreecasestudiesofcrowdscience projectsinbiochemistry,astronomy,andmathematics.Thesecase studiesillustratetheheterogeneityconcerningwhatcrowdscience projectsdoandhowtheyareorganized.Second,weidentify organi-zationalfeaturesthatarecommontocrowdscienceprojectswhile alsodistinguishingthemfromprojectsin“traditionalscience”and otheremergingorganizationalparadigmssuchascrowdsourcing andinnovationcontests.Third,wediscusspotentialbenefitsand challengescrowdscienceprojectsarelikelytoface,howchallenges maybeaddressed,andforwhatkindsoftasksthebenefitsand challengesmaybemostpronounced.Indoingso,webuildupon organizationaltheoriesofproblemsolvingaswellaspriorwork inareassuchasopeninnovationandopensourcesoftware devel-opment.Wethenoutlineanagendaforfutureresearchoncrowd scienceandalsodiscusshowcrowdsciencemayserveasaunique settingforthestudyofproblemsolvingmoregenerally.Finally, wediscusspotentialimplicationsforfundingagenciesandpolicy makers.

2. Examplesofcrowdscienceprojects 2.1. Foldit

Bythe1990s,scientistshaddevelopedsignificantinsightsinto thebiochemicalcompositionofproteins.However,theyhadavery limitedunderstandingof proteinstructureandshapes.Shape is importantbecauseitexplainsthewayinwhichproteinsfunction andinteractwithcells, virusesor proteinsof thehumanbody. Forexample,a suitablyshapedproteincouldblockthe replica-tionofavirus.Oritcouldsticktotheactivesiteofabiofueland catalyzeachemicalreaction.Conventionalmethodstodetermine proteinshapesincludedX-raycrystallography,nuclearmagnetic resonancespectroscopy,andelectronmicroscopy.Unfortunately, thesemethodswereextremelyexpensive,costingupto100,000 dollarsforasingleprotein,withmillionsofproteinstructuresyet tobedetermined.In2000DavidBakerandhislabattheUniversity ofWashington,Seattle,receivedagrantfromtheHowardHughes MedicalInstitutetoworkonshapedeterminationwith computa-tionalalgorithms.Researchersbelievedthatinprinciple,proteins shouldfoldsuchthattheirshapeemploystheminimumlevelof energytopreventtheproteinfromfallingapart.Thus, computa-tionalalgorithmsshouldbeabletodeterminetheshapeofaprotein basedontheelectricchargesofitscomponents.However,each portionofaproteinsequenceiscomposedofmultipleatomsand eachatomhasitsownpreferencesforbondingwithorstanding apartfromotheratoms,resulting ina largenumberof degrees offreedominasinglemolecule,makingcomputationalsolutions extremelydifficult.Bakerandhislabdevelopedanalgorithmcalled

Rosettathatcombineddeterministicandstochastictechniquesto computethelevelofenergyofrandomlychosenproteinshapes insearchofthebestresult.Afterseveralyearsofimprovements, the algorithm worked reasonably well, especially on partially determinedshapes.Becausethecomputationwasextremely inten-sive,inthefallof2005theteamlaunchedRosetta@home,agrid systemthatallowedvolunteerstomakeavailablethespare com-putationalcapacityoftheirpersonalcomputers.Acriticalfeature ofRosetta@homewasavisualinterfacethatshowedproteinsas theyfolded.Althoughthevolunteersweremeanttocontributeonly computationalpower,lookingattheRosettascreensavers,someof thempostedcommentssuggestingbetterwaystofoldtheproteins thanwhattheysawthecomputerdoing.Theseotherwisenaïve commentsinspiredgraduatestudentsattheComputerScienceand Engineeringdepartmentandpost-docsatBaker’slab.Theybeganto wonderifhumanvisualabilitycouldcomplementcomputerpower inareaswherethecomputerintelligencewasfallingshort.They developedaweb-basedgamecalledFolditthatenabledplayersto modelthestructureofproteinswiththemoveofthemouse. Play-erscouldinspectatemplatestructurefromdifferentangles.They couldthenmove,rotateorflipchainbranchesinsearchof bet-terstructures.Thesoftwareautomaticallycomputedthelevelof energyofnewconfigurations,immediatelyshowingimprovement orworsening.Certain commonstructuralproblems,suchasthe existenceofclashesorvacuumsintheprotein,werehighlightedin red,sothattheplayercouldquicklyidentifyareasofimprovement. AmenuofautomatictoolsborrowedfromRosettaenabledeasy localadjustmentsthatthecomputercoulddobetterthanhumans. Forexample,proteinscouldbewiggledorsidechainsshakenwith asimpleclickofthemouse.Thesefeatures,combinedwithafew onlinetutorials,allowedpeopletostartfoldingproteinswithout knowingvirtuallyanythingaboutbiochemistry.Mostinterestingly, thesoftwarewasdesignedas acomputergameand includeda scoreboardthatlistedplayers’performance.Assuch,players com-petedtoclimbupintherankingsandtheycouldalsosetupteams andsharestrategiestocompeteagainstotherteams.Fig.1provides animpressionoftheFolditinterface.2

ThegamewasinitiallylaunchedinMay2008.BySeptemberof thesameyearithadalreadyengaged50,000users.Theplayerswere initiallygivenknownproteinstructuressothattheycouldseethe desiredsolution.Afterafewmonthsofpractice,severalplayershad workedtheirwaytoshapesveryclosetothesolutionandinseveral caseshadoutperformedthebeststructuresdesignedbyRosetta (Cooperetal.,2010).Therewasmuchexcitementatthelaband theresearchersinvitedafewtopplayerstowatchthemplaylive. Fromtheseobservations,itbecameclearthathumanintuitionwas veryusefulbecauseitallowedplayerstomovepastthetrapsoflocal optima,whichcreatedconsiderableproblemsforcomputers.One yearafterlaunchtherewereabout200,000activeFolditplayers.

Inthefollowingmonths,thedevelopmentofRosettaandthat ofFolditproceededincombination.SomeproteinswhereRosetta wasfailingweregiventoFolditplayerstoworkon.Inexchange, playerssuggested additional automatic toolsthat they thought thecomputershouldprovideforthem.Meanwhile,playerssetup teamswithnamessuchas“AnotherHourAnotherPoint”or“Void Crushers”andusedchatsandforumstointeract.Someplayersalso began toelaboratetheirown“recipes”, encoded strategies that couldbecomparedtothosecreatedinthelab.Forexample,some strategiesinvolvedwigglingasmallpartofaprotein,ratherthan theentirestructure,otherswerebasedonfusingproteinpartsor

2TheFolditcasestudyisbasedonthefollowingwebsources:http://fold.it; http://www.youtube.com/watch?v=2adZW-mpOk; http://www.youtube.com/ watch?v=PE048WhCCA; http://www.youtube.com/watch?v=nfxGnCcx9Ag;

(3)

C. Franzoni, H. Sauermann / Research Policy 43 (2014) 1– 20 3

Examplesofcrowdscienceprojects.

Name URL Hostingplatform Field Primarytasks

AncientLives http://ancientlives.org Zooniverse Archeology InspectfragmentsofEgyptianpapyri andtranscribecontent

Argus http://argus.survice.com/ None Cartography Measureseabeddepthduring navigation

BatDetective http://www.batdetective.org/ Zooniverse Biology Listentobatcallsandclassifythem Connect2Decode

(C2D)*

https://sites.google.com/a/osdd.net/c2d-01/ None Genetics AnnotategenesofMycobacterium tuberculosis

Connect2Decode: Cloning,Expression andPurificationof Proteins

http://c2d.osdd.net/ Connect2Decode Genetics ClonegenesinMycobacterium tuberculosis

Connect2Decode: Cheminformatics andChemicalData Mining

http://c2d.osdd.net/home/cheminformatics Connect2Decode Genetics Usechemicaldescriptorsanddata miningtoidentifyandannotate moleculeswithdesirableproperties CrowdComputingfor

Cheminformatics

http://vinodscaria.rnabiology.org/2C4C CSIRNetworkprojecton ncRNAbiology

Bioinformatics Developpredictivemodelsfor biologicalactivitiesofchemical molecules

Cheminformatics CrowdComputing forTubercolosisDrug Discovery

http://vinodscaria.rnabiology.org/cheminformatics-crowd-computing-for-tuberculosis CSIRNetworkprojecton ncRNAbiology

Bioinformatics Annotatedatasetofactive anti-tubercularmolecules,byusing existingorimprovedprioritization algorithms,toidentifyasmallsubsetof drugcandidates

OpenDinosaurProject http://opendino.wordpress.com/about/ None Paleontology Searchinpublishedliteratureand reportdinosaurlimblengthand/or measureandreportfrommuseum specimens

CycloneCenter http://www.cyclonecenter.org/ Zooniverse Climatology Watchsatelliteimagesofcyclonesand classifystorms

Discoverlife http://www.discoverlife.org/ None Biology Observeandreportimagesand locationofwildspecies eBird http://www.ebird.org/ CornellLabofOrnithology Biology Observeandreportimagesand

locationofwildbirds

EteRNA http://eterna.cmu.edu/ None Biochemistry Game.ModifytheshapeofRNAbases combinationtofitafoldedstructure Foldit http://www.fold.it None Biochemistry Game.Modifyavisual3Dmodelof

proteintooptimizeitsshape GalaxyZoo http://www.galaxyzoo.org/ Zooniverse Astronomy Inspectandclassifyimagesofgalaxies GreatBackyardBird

Count

http://birdsource.org/gbbc/ CornellLabofOrnithology Biology Reportfrequencyofbirdsightings occurringduring15minperiods GreatSunflower

Project

http://www.greatsunflower.org/ None Biology Growflowersthatattractbeesand otherpollinationinsects,observeand reportfrequencyofinsectsvisits IceHunters* http://www.icehunters.org Zooniverse Astronomy Inspecttelescopeimagesandidentify

possibletargetsforNewHorizons mission

MilkywayProject http://www.milkywayproject.org Zooniverse Astronomy Inspecttelescopeimages,identify cloudsandbubbles

MoonZoo http://www.moonzoo.org/ Zooniverse Astronomy InspectsatelliteimagesoftheMoon andreportcraters

Nestwatch http://nestwatch.org CornellLabofOrnithology Biology Observenestactivities(eggs,young, fledglings)andreportcount NotesfromNature http://www.notesfromnature.org/ Zooniverse Biology Inspectimagesofspecimensdigitized

fromnaturalhistorymuseumsand transcribetheirdescriptions

(4)

C. Franzoni, H. Sauermann / Research Policy 43 (2014) 1– 20 Table1(Continued)

Name URL Hostingplatform Field Primarytasks

OldWeather http://www.oldweather.org Zooniverse Climatology InspectimagesoflogbooksofRoyal Navyshipsorarticexplorationships andtranscribeclimateandlocation reports

Patientslikeme http://www.patientslikeme.com/ None Medicine Inputpersonaldataaboutsymptoms andtreatmentofdiseases

PigeonWatch* http://www.birds.cornell.edu/pigeonwatch CornellLabofOrnithology Biology Observepigeonspeciesandreport

imagesand/orlocationofobservation Phylo http://phylo.cs.mcgill.ca None Genetics Game.Inspectsequenceofcolored

shapesandmoveshapestooptimize theiralignment

PlanetHunters http://www.planethunters.org Zooniverse Astronomy Inspectstarlightcurvesregisteredby theKeplerspacecraftandreport possibleplanettransits Polymath1.Density

Hales–Jewett*

http://gowers.wordpress.com/2009/02/01/a-combinatorial-approach-to-density-hales-jewett/ None Mathematics Solveamathematicalproblem Polymath2.Banach

Spaces*

http://gowers.wordpress.com/2009/02/17/must-an-explicitly-defined-banach-space-contain-c0-or-ellp/ None Mathematics Solveamathematicalproblem Polymath3:

PolynomialHirsh Conjecture*

http://gilkalai.wordpress.com/2009/07/17/the-polynomial-hirsch-conjecture-a-proposal-for-polymath3/ None Mathematics Proveamathematicalconjecture

Polymath4: Deterministicwayto findprimes*

http://polymathprojects.org/2009/07/27/proposal-deterministic-way-to-find-primes/ Polymathprojects Mathematics Solveamathematicalproblem

Polymath5:Erdos Discrepancy*

http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/ Polymathprojects Mathematics Solveamathematicalproblem Polymath6:Improving

theboundsforRoth’s theorem

http://polymathprojects.org/2011/02/05/polymath6-improving-the-bounds-for-roths-theorem/ Polymathprojects Mathematics Solveamathematicalproblem

Polymath7:HotSpots Conjecture

http://polymathprojects.org/2012/06/03/polymath-proposal-the-hot-spots-conjecture-for-acute-triangles/ Polymathprojects Mathematics Proveamathematicalconjecture Polymath8:Bounded

gapsbetweenprimes

http://polymathprojects.org/2013/06/04/polymath-proposal-bounded-gaps-between-primes/ Polymathprojects Mathematics Solveamathematicalproblem ProjectFeederWatch http://www.birds.cornell.edu/pfw/ CornellLabofOrnithology Biology Installabirdfeederinbackyard,

observeandreportfrequencyofbird visits

SeafloorExplorer http://www.seafloorexplorer.org/ Zooniverse Biology Inspectseafloorimages,identifyand reporttargetspecies

Setilive http://setilive.org/ Zooniverse Astronomy Inspectliveradiofrequencysignals broadcastedbytheSETIInstitute’s AllenTelescope.Consistent simultaneousreportsofpotential extraterrestrialsignalsprompta secondtelescopeinspectionwithin minutes

SecchiApp https://www1.plymouth.ac.uk/marine/secchidisk/Pages/default.aspx None Biology Measuretheturbidityofseawater usingaSecchidiskanduseamobile apptoreportthecorresponding concentrationofphytoplankton SnapshotSerengeti http://www.snapshotserengeti.org/ Zooniverse Biology InspectpicturesoftheSerengetiLion

Projectandclassifyimagesofwildlife SolarStormwatch http://www.solarstormwatch.com Zooniverse Astronomy Watchvideosofsolaractivityand

reportinceptionandlengthofsolar explosions

(5)

Space NEEMO * https://www.zooniverse.org/lab/neemo Zooniverse Biology Inspect underwater images, identify and classify marine life Spacewraps https://www.spacewraps.org Zooniverse Astronomy Inspect telescope images and report possible gravitational lenses Stardust@home http://stardustathome.ssl.berkeley.edu/ None Astronomy Inspect magnified images of pristine space material and report possible stardust particles Sun Lab http://www.pbs.org/wgbh/nova/labs/lab/sun/ NovaLabs Astronomy Inspect images of sun, report and classify sun spots Whale Song http://whale.fm Zooniverse Zoology Listen to underwater recordings of whale chants from ships and report if two chants are similar What’s the score http://www.whats-the-score.org/ Zooniverse Music Inspect pictures of musical score covers from the Bodleian Libraries and transcribe elements in covers Yardmap Network http://www.yardmap.org/ Cornell Lab of Ornithology Biology Report frequency of bird sightings in own backyard; report backyard characteristics and conditions *Projects completed or inactive as of July, 2013.

bandingchains.Complexstrategiesarenow oftenembeddedin tools(“recipes”)thatcansubsequentlybedownloadedandused byothers.3

Theresultswerestriking.A playerstrategycalled“Bluefuse” completelydisplaced Rosetta“ClassicRelax”,andoutperformed “FastRelax”,apieceofcodethattheRosettadevelopershadworked onforquitealongtime.TheseresultswerepublishedinPNASand theplayersofFolditwereco-authorsunderacollectivepseudonym (Khatibet al., 2011a).In December 2010,encouraged by these results,FirasKhatib,FrankDiMaioinBaker’slabandSethCooper inComputerScienceandEngineering,amongothersworkingon Foldit,thoughtthattheplayerswerereadyforareal-world chal-lenge.Consultingwithagroupofexperimentalists,theychosea monomericretroviralprotease,aproteinknowntobecriticalfor monkey-virusHIVwhosestructurehadpuzzledcrystallographers foroveradecade.Threeplayersfromthe“Contenders”Folditteam cametoafairlydetailedsolutionoftheproteinstructureinlessthan threeweeks(publishedasKhatibetal.,2011b).AsofSeptember 2012,Folditplayerswereco-authorsoffourpublicationsintop journals.

2.2. GalaxyZoo

In January2006,a StardustSpacecraft capsulelandedin the Utahdesertwithprecioussamples ofinterstellar dustparticles afterhavingencounteredthecometWild2.Particlesinthe sam-plewereastinyasamicron,andNASAtook1.6millionautomated scanningmicroscopeimagesoftheentiresurfaceofthecollector. Becausecomputersarenotparticularlygoodatimagedetection, NASAdecidedtouploadtheimagesonlineandtoaskvolunteers tovisuallyinspectthematerialandreportcandidatedust parti-cles.Theexperiment,knownasStardust@Home,hadalargeecho inthecommunityofastronomers,wheretheinspectionoflarge collectionsofimagesisacommonproblem.4

In2007,KevinSchawinski,thenaPhDinChrisLintott’sgroup attheUniversityofOxford,thoughtaboutusingthesamestrategy, althoughthegroup’sproblemwasdifferent.Inordertounderstand theformationofgalaxies,theyweresearchingforthose ellipti-calgalaxiesthathadformedstarsmostrecently.Inthespringof 2007theirinsightswere basedona limitedsample ofgalaxies thatSchawinskihadcodedmanually,butmoredatawereneeded toreallyprovetheirpoint.Afewmonthsearlier,theSloan Dig-ital Sky Survey (SDSS) had made available 930,000 pictures of distantgalaxiesthatcouldprovidethemwithjusttheraw mate-rialtheyneededfortheirwork.Tobeabletoprocessthislarge amountofdata,theresearcherscreatedtheonlineplatformGalaxy Zoo inthesummerof 2007.Volunteerswere askedtosignup, readanonlinetutorial,andthencodesixdifferentpropertiesof astronomicalobjectsvisibleinSDSSimages.Participationbecame quicklyviral,partlybecausetheimageswerebeautifultolookat, and partlybecausetheBBC publicizedtheinitiative in itsblog. Beforetheprojectstarted,thelargestpublishedstudywasbased on3000galaxies.Sevenmonthsaftertheprojectwaslaunched, about900,000 galaxieshad beencodedand multiple classifica-tionsofagivengalaxybydifferentvolunteerswereusedtoreduce theincidenceofincorrectcoding,foratotalofroughly50million classifications.Foranindividualscientist,50millionclassifications

3http://foldit.wikia.com/wiki/Recipes;retrievedMarch29,2013.

4The Galaxy Zoo description is based on Nielsen (2011) and

on the following web sources: http://www.galaxyzoo.org/story;

http://supernova.galaxyzoo.org/; http://mergers.galaxyzoo.org/; http://www. youtube.com/watch?v=jzQIQRr1Bo&playnext=1&list=PL5518A8D0F538C1CC& feature=resultsmain; http://data.galaxyzoo.org/; http://zoo2.galaxyzoo.org/;

http://hubble.galaxyzoo.org/; http://supernova.galaxyzoo.org/about#supernovae; retrievedSeptember24,2012.

(6)

Fig.1.Foldituserinterface.Source:http://fold.it/portal/info/about.

wouldhaverequiredmorethan83yearsoffull-timeeffort.5The GalaxyZoodataallowedLintott’steamtosuccessfully complete thestudytheyhadinitiallyplanned.However,thiswasjustthe beginningofGalaxyZoo’scontributionstoscience,partlybecause participantsdidnotsimplycodegalaxies–theyalsodevelopedan increasingcuriosityforwhattheysaw.ConsiderthecaseofHanny vanArkel,aDutchschoolteacherwhointhesummerof2007 spot-tedanunusual“voorwerp”(Dutchfor “thing”)thatappearedas anirregularly-shapedgreencloudhangingbelowagalaxy.After herfirstobservation,astronomersbegantopointpowerful tele-scopestowardthecloud.Itturnedoutthatthecloudwasaunique object that astronomers now explain as being an extinguished quasarwhoselightechoremainsvisible.Zooitesalsoreportedother unusualgalaxiesfortheircolororshape,suchasverydensegreen galaxiesthattheynamed“Greenpeagalaxies”(Cardamoneetal., 2009).Akeywordsearchfor“Hanny’sVoorwerp”inWebof Knowl-edgeas ofJanuary 2013showedeight publishedpapers, and a searchfor“Greenpeagalaxies”gavesix.6

ThecodedGalaxyZoodataweremadepubliclyavailablefor furtherinvestigationsin2010.AsofSeptember2012,therewere 141scientificpapersthatquotethesuggestedcitationforthedata release,36ofwhicharenotcoauthoredbyLintottandhisgroup. AfterthesuccessofthefirstGalaxyZooproject,GalaxyZoo2was launchedin2009tolookmorecloselyatasubsetof250,000 galax-ies.GalaxyZooHubblewaslaunchedtoclassifyimagesofgalaxies madeavailablebyNASA’sHubbleSpaceTelescope.Otherprojects lookedatsupernovaeandatmerginggalaxies.Threeyearsafter GalaxyZoostarted,250,000Zooiteshadcompletedabout200 mil-lionclassifications,andcontributorsarecurrentlyworkingonthe latestandlargestreleaseofimagesfromtheSDSS.

ThesuccessofGalaxyZoosparkedinterestinvariousareasof sci-enceandthehumanities.In2009,Lintottandhisteamestablished acooperationwithotherinstitutionsintheUKandUSA(theCitizen ScienceAlliance)torunanumberofprojectsonacommon plat-form“TheZooniverse”.TheZooniverseplatformreceivedfinancial

5 Schawinskiestimatedhismaximuminspectionrateasbeing50,000coded

galaxiesper month. http://www.youtube.com/watch?v=jzQIQRr1Bo&playnext= 1&list=PL5518A8D0F538C1CC&feature=resultsmain; retrieved September 21, 2012.

6 Thisisanincompletecountbecausemanyarticlesrefertotheseobjectsby

dif-ferentnames.AcitationsearchoftheCardamoneetal.(2009)paperintheSAO/NASA AstrophysicsDataSystemreportedover40entries.

supportfromtheAlfredP.SloanFoundationin2011andcurrently hostsprojectsinfieldsasdiverseasastronomy,marinebiology, climatologyandmedicine.Recentprojectshavealsoinvolved par-ticipantsinabroadersetoftasksandincloserinteractionwith machines.For example, contributors tothe Galaxy Supernovae projectwereshownpotentiallyinteresting targetsidentifiedby computersatthePalomarTelescope.Thecontributorsscreenedthe largenumberofpotentialtargetsandselectedasmallersubsetthat seemedparticularlypromisingforfurtherobservation.This itera-tiveprocesspermittedastronomerstosavepreciousobservation timeatlargetelescopes.Lintottthinksthatinthefuturevolunteers willbeusedtoprovidereal-timejudgmentswhencomputer pre-dictionsareunreliable,combiningartificialandhumanintelligence inthemostefficientway.7Inanewproject(GalaxyZooQuench), contributorsparticipateinthewholeresearchprocessby classi-fyingimages,analyzingdata,discussingfindings,andcollectively writinganarticleforsubmission.

2.3. Polymath

TimothyGowersisaBritishmathematiciananda1998Fields Medal recipient for his work on combinatorics and functional analysis.8Aneclecticpersonalityandanactiveadvocateof open-ness in science, he keeps a regular blog that is well read by mathematicians.OnJanuary29,2009hepostedacommentonhis blogsayingthathewouldliketotryanexperimenttocollectively solveamathematicalproblem.Inparticular,hestated:

Theidealoutcomewouldbeasolutionoftheproblemwithno singleindividualhavingtothinkallthathard.[...]Sotrytoresist

7http://www.youtube.com/watch?v=jzQIQRr1Bo&playnext=1&list= PL5518A8D0F538C1CC&feature=resultsmain;retrievedSeptember21,2012.

8ThePolymathcasestudyisbasedonNielsen(2011)aswellasthefollowing

sources: http://gowers.wordpress.com/2009/01/27/is-massively-collaborative-mathematics-possible/ http://gowers.wordpress.com/2009/01/30/background-to-a-polymath-project/; http://gowers.wordpress.com/2009/02/01/a-combinatorial-approach-to-density-hales-jewett/; http://gowers.wordpress.com/2009/03/10/ problem-solved-probably/; http://mathoverflow.net/questions/31482/the-sensitivity-of-2-colorings-of-the-d-dimensional-integer-lattice. http://gilkalai. wordpress.com/2009/07/17/the-polynomial-hirsch-conjecture-a-proposal-for-polymath3/; http://en.wordpress.com/tag/polymath4/; http://gowers.wordpress. com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/;

http://gowers.wordpress.com/2009/12/28/the-next-polymath-project-on-this-blog/;retrievedSeptember,2012.

(7)

thetemptationtogoawayandthinkaboutsomethingandcome backwithcarefullypolishedthoughts:justgivequickreactions towhatyouread[...],explainbriefly,butaspreciselyasyou can,whyyouthinkitisfeasibletoanswerthequestion.9 Inthenextfewhours,severalreaderscommentedonhisidea. Theyweregenerallyinfavoroftryingtheexperimentandbegan todiscusspracticalissueslikewhetherornotablogwastheideal formatfortheproject,andiftheoutcomeshouldbeapublication withasinglecollectivename,orratherwithalistofcontributors. Encouragedby thepositivefeedback, Gowerspostedtheactual problemonFebruary1st:acombinatorialprooftothedensity ver-sionoftheHales–Jewetttheorem.The discussionthat followed spanned6weeks(seeFig.2foranexcerpt).

Among the contributors were several university professors, includingTerryTao,atop-notchmathematicianatUCLAandalso aFieldsMedalist,aswellasseveralschoolteachersandPhD stu-dents.Afterafewdays,thediscussionhadbranchedoutintoseveral threadsandawikiwascreatedtostoreargumentsandideas. Cer-tain contributors weremore active than others, but significant progresswascomingformvarioussources.OnMarch10,Gowers announcedthattheproblemwasprobablysolved.Heandafew colleaguestookonthetaskofverifyingtheworkanddraftinga paper,andthearticlewaseventuallypublishedintheAnnalsof Mathematicsunderthepseudonym“D.H.J.Polymath”(Polymath, 2012).

ThrilledbythesuccessoftheoriginalPolymathproject, sev-eralofGowers’colleagueslaunchedsimilarprojects,thoughwith varying degrees of success. In June 2009, Terence Tao orga-nized a collaborative entry to the International Mathematical Olympiadtakingplaceannuallyinthesummer.Similarprojects havebeensuccessfullycompletedeveryyearandareknownas Mini-Polymath projects.Scott Aaronson begana projectonthe “sensitivityof 2-coloringsof thed-dimensional integerlattice”, whichwasactiveforoverayear,butdidnotgettoafinal solu-tion.GilKalaistartedaprojectonthe“PolynomialHirshconjecture” (Polymath3),againwithinconclusiveresults.TerenceTaolaunched aprojectforfindingprimesdeterministically(Polymath4),which wassuccessfullycompletedandledtoapublicationinMathematics ofComputation(Taoetal.,2012).10InJanuary2010Timothy Gow-ersandGilKalaibegancoordinatinganewPolymathprojectonthe ErdosDiscrepancyProblem(knownasPolymath5).Aninteresting aspectofthisprojectisthattheparticularproblemtobesolvedwas chosenthroughapublicdiscussiononGowers’blog.Several Poly-mathprojectsarecurrentlyrunning.Despitethegrowingnumber ofprojects,however,thenumberofcontributorstoeach particu-larprojectremainsrelativelysmall,typicallynotexceedingafew dozen.11

Itisinterestingtonotethatovertime,Polymathprojectshave developedorganizationalpracticesthatfacilitatecollective prob-lemsolving.Inparticular,acommonchallengeisthatwhenthe discussiondevelopsintohundredsofcomments,itbecomes dif-ficultforcontributorstounderstandwhichtracksarepromising, which have been abandoned, and where the discussion really stands.Thechronologicalorderofcommentsisnotalways infor-mativebecausepeoplerespondtodifferentpostsandtheproblems branchoutinparallelconversations,makingitdifficulttocontinue

9 http://gowers.wordpress.com/2009/01/27/is-massively-collaborative-mathematics-possible/,retrievedSeptember,2012.

10WhileoriginallysubmittedunderthepseudonymD.H.J.Polymath,thispaper

endedupbeingpublishedwithaconventionalsetofauthorsattheinsistenceofthe journaleditors,whofeltthatthepseudonymwouldconflictwithbibliographical record-keeping(emailcommunicationwithTerenceTao,May22,2013).

11 http://michaelnielsen.org/blog/the-polymath-project-scope-of-participation/;

retrievedSeptember28,2012.

discussions ina meaningful way. Toovercometheseproblems, Gowers,Taoandotherprojectleadersstartedtotakeontheroleof moderators.Whenthecommentsonatopicbecometoomanyor toounfocused,theysynthesizethelatestresultsinanewpostor openseparatethreads.Polymathalsoinspiredmathoverflow.net, ausefultoolforthebroaderscientificcommunity.Launchedinthe fallof2009,thisplatformallowsmathematicianstopostquestions orproblemsandtoprovideanswersorrateothers’answersina commentedblog-stylediscussion. Insomecases,thediscussion developsinwayssimilartoasmallPolymathprojectandanswers havealsobeencitedinscholarlyarticles.

2.4. Overviewofadditionalcrowdscienceprojects

Table1showsadditionalexamplesofcrowdscienceprojects. Thetablespecifiestheprimaryscientificfieldofaproject, illus-tratingthebreadthofapplicationsacrossfieldssuchasastronomy, biochemistry,mathematics,orarcheology.Wealsoindicatewhat kindoftaskisperformedbythecrowd,e.g.,theclassificationof imagesandsounds,thecollectionofobservationaldata,or collec-tiveproblemsolving.Thiscolumnshowsthevarietyoftasksthat canbeaccomplishedincrowdscienceprojects.Finally,Table1 indi-cateswhichprojectsarepartoflargercrowdscienceplatforms.We willdrawontheearliercasesandtheexampleslistedinTable1 throughoutoursubsequentdiscussion.

3. Characterizingcrowdscienceandexploring heterogeneityacrossprojects

Examplessuchasthosediscussedinthepriorsectionprovide fascinatinginsightsintoanemergingwayofdoingscienceandhave intriguedmanyobservers.However,whilethereisagreementthat theseprojectsaresomehow“different”fromtraditionalscience, asystematicunderstandingoftheconcretenatureofthese differ-encesislacking.Similarly,itseemsimportanttoconsidertheextent towhichtheseprojectsdifferfromotheremergingapproachesto producingknowledgesuchascrowdsourcingorinnovation con-tests.

InthefollowingSection3.1,weidentifytwokeyfeaturesthat distinguishcrowdsciencefromotherregimesofknowledge pro-duction: openness in project participation and openness with respecttothedisclosureofintermediate inputssuchasdataor problemsolvingapproaches.Notallprojectssharethesefeatures tothesamedegree;however,thesedimensionstendtodistinguish crowdsciencefromotherorganizationalforms,whilealsohaving importantimplicationsforopportunitiesandchallengescrowd sci-enceprojectsmayface.InthesubsequentSection3.2,wewilldelve more deeply into heterogeneityamong crowd scienceprojects, reinforcingthenotionthat“crowdscience”isnotawell-defined typeofprojectbutratheranemergingorganizationalmodeofdoing sciencethatallowsforsignificantexperimentation,aswellas con-siderablescopeinthetypesofproblemsthatcanbeaddressedand inthetypesofpeoplewhocanparticipate.

3.1. Puttingcrowdscienceincontext:Differentdegreesof openness

A firstimportant featureof crowdsciencethat marksa dif-ference totraditional scienceisthat participationin projectsis opentoalargenumberofpotentialcontributorsthataretypically unknowntoeachotherortoprojectorganizersatthebeginning ofaproject.Anyindividualwhoisinterestedinaprojectisinvited tojoin.RecallthatFieldsMedalistTimothyGowers’invitationto joinPolymathwasaccepted,amongtheothers,byTerenceTao, anotherFieldsMedalistworkingatUCLA,aswellasanumberof

(8)

Fig.2. ExcerptfromPolymathdiscussion.Source:http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/.

otherless famouscolleagues,schoolteachers,and graduate stu-dents.OpenparticipationisevenmoresalientinGalaxyZoo,which recruitsparticipantsonitswebsiteandboastsacontributorbase ofover250,000.Notethattheemphasishereisnotsimplyona largenumberofprojectparticipants(largeteamsizeisbecoming increasinglycommonevenintraditionalscience,seeWuchtyetal. (2007)).Rather, openparticipation entailsvirtually unrestricted “entry”byanyinterestedandqualifiedindividual,oftenbasedon self-selectioninresponsetoageneralcallforparticipation.

Asecondfeaturethattendstobecommonisthatcrowd sci-enceprojectsopenlydiscloseasubstantialpartoftheintermediate inputsusedintheknowledgeproductionsuchasdatasetsor prob-lemsolvingapproaches.TheWhaleSongProject,forexample,has uploadedadatabaseofaudiorecordingsofwhalesongs,andGalaxy Zoohasmadepubliclyavailabletheclassificationsmadeby volun-teercontributors(Lintottetal.,2010).Asnotedearlier,manyFoldit playersalsosharetheirtrickstofacilitatecertainoperations,and complicatedstrategiesareencodedindownloadablescriptsand recipes.Thus,muchoftheprocess-relatedknowledgethathas tra-ditionallyremainedtacitinscientificresearch(Stephan,1996)is

codifiedandopenlysharedwithinandoutsideaparticularproject. Wealsosubsumeundertheterm“intermediateinputs”recordsof theactualprocessthroughwhichprojectparticipantsdevelop, dis-cuss,andevaluateproblemsolvingstrategiessincetheresulting insightscanfacilitatefutureproblemsolving.Consider,for exam-ple,thefollowingdiscussionofaproblemsolvingapproachfrom thePolymath4blog:

Anonymous(August9,2009at5:48pm):Tim’sideasonsumsets oflogsgotmetothinkingaboutarelated,butdifferentapproach tothesesortsof“spacingproblems”:saywewanttoshowthat nointerval[n,n+logn]containsonly(logn)ˆ100–smooth numbers.Ifsuchstrangeintervalsexist,thenperhapsonecan showthatthereareloadsofdistinctprimespin[(logn)ˆ10, (logn)ˆ100]thateachdividesomenumberinthisinterval[...] Terence Tao (August 9, 2009 at 6:00pm): I quite like this approach–itusestheentiresetSofsumsofthe1/pi,rather thanafinitesumsetofthelog-integersorlog-primes,andso shouldinprinciplegetthemaximalboostfromadditive com-binatorialmethods.It’salsousingthespecificpropertiesofthe

(9)

Fig.3. Knowledgeproductionregimeswithdifferentdegreesofopenness.

integersmoreintimatelythanthelogarithmicapproach,which isperhapsapromisingsign.12

Wesuggestthatthesetwodimensions–opennessin participa-tionandopennesswithrespecttointermediateinputs–donotonly describecommonfeaturesofcrowdscienceprojectsbutthatthey alsodifferentiatecrowdscienceprojectsfromothertypesof knowl-edgeproduction.13Inparticular,Fig.3showshowdifferencesalong thesetwodimensionsallowustodistinguishinanabstractway crowdsciencefromthreeotherknowledgeproductionregimes.

InFig.3,crowdsciencecontrastsmoststronglywiththe bottom-leftquadrant,whichcapturesprojectsthatlimitcontributionstoa relativelysmallandpre-definedsetofindividualsanddonotopenly discloseintermediateinputs.Wecallthisquadrant“traditional sci-ence”becauseitcaptureskeyfeaturesofthewaysciencehasbeen doneoverthelastcentury.Ofcourse,traditionalscienceisoften called“open”sciencebecausefinalresultsareopenlydisclosedin theformofpublications(MurrayandO’Mahony,2007;David,2008; SauermannandStephan,2013).However,whiletraditionalscience is“open”inthatsense,itislargelyclosedwithrespecttothe dimen-sionscapturedinourframework.Inparticular,researchersretain exclusiveuseoftheirkeyintermediate inputs,suchasthedata andtheheuristicsusedtosolveproblems.While someofthese inputsaredisclosedinmethodologysectionsofpapers,mostof themremaintacitoraresharedonlyamongthemembersofthe researchteam(Stephan,1996).

Thislackofopennessfollowsfromthelogicoftherewardsystem ofthetraditionalinstitutionofscience.AsemphasizedbyMerton inhisclassicanalysis,traditionalscienceplacesonekeygoalabove allothers:gainingrecognitioninthecommunityofpeersbybeing thefirsttopresentorpublishnewresearchresults(Merton,1973; Stephan,2012).Whilescientistsmayalsocareaboutothergoals, publishingandtheresultingpeerrecognitionarecriticalbecause theycanyieldindirectbenefitssuchasjobsecurity(tenure),more resources for research, more grants, more students, and so on (LatourandWoolgar,1979;SauermannandRoach,2013).Since mostoftherecognitiongoestothepersonwhoisfirstin discover-ingandpublishingnewknowledge,therewardsystemofscience inducesscientiststoexpendgreateffortandtodiscloseresearch

12 http://polymathprojects.org/2009/08/09/research-thread-ii-deterministic-way-to-find-primes/;retrieved29March2012.

13 Openparticipationandopendisclosureofinputsarealsocharacteristicofopen

sourcesoftware(OSS)development,althoughthegoalofthelatterisnotthe produc-tionofscientificknowledgebutofsoftwareartifacts.Giventhesesimilarities,our discussionwilltakeadvantageofpriorworkthathasexaminedtheorganizationof knowledgeproductioninOSSdevelopment.

resultsasquicklyaspossible.Atthesametime,thissystem encour-agesscientiststoseekwaystomonopolizetheirareaofexpertise andtobuildacompetitiveadvantageoverrivals.Assuch,the tradi-tionalrewardsystemofsciencediscouragesscientistsfromhelping contenders,explainingwhydata,heuristics,andproblemsolving strategiesareoftenkeptsecret(DasguptaandDavid,1994;Walsh etal.,2005;Haeussleretal.,2009).

Recognizingthatsecrecy withrespecttointermediateinputs mayslowdowntheprogressofscience,fundingagencies increas-inglyaskscientiststoopenlydisclosedataandlogsasacondition of funding.The NationalInstitutesofHealth andtheWellcome Trust,forexample,requirethatdatafromgeneticstudiesbemade openlyavailable.Similarly,moreandmorejournalsincludingthe flagshippublicationsScienceandNaturerequireauthorsto pub-liclydepositdataandmaterialssuchthatanyinterestedscientist canreplicateorbuilduponpriorresearch.14Assuch,anincreasing numberofprojectshavemovedfromquadrant4–“traditional sci-ence”–intoquadrant3(bottomright).Eventhoughprojectsinthis quadrantaremoreopenbydisclosingintermediateinputsafterthe projectisfinished,participationinagivenprojectremainsclosed tooutsiders.

Letusfinallyturntoprojectsin quadrant1(topleft),which solicitcontributionsfromawiderangeofparticipantsbutdonot publiclydiscloseintermediateinputs.Moreover,whilenot explic-itlyreflectedin Fig.3,projectsin thiscelldifferfromtheother threecellsinthatevenfinalprojectresultsaretypicallynotopenly disclosed. Examples of this organizing mode include Amazon’s MechanicalTurk(whichpaysindividualsforperformingtaskssuch as collectingdatafromwebsites) aswell asincreasingly popu-lar innovationcontestplatformssuchas Kaggleor Innocentive. Intheseinnovationcontests,“seekers”posttheirproblemsonline andoffermonetaryprizesforthebestsolutions.Anybodyisfreeto competebysubmittingsolutions,althoughmanycontestsrequire that“solvers”firstacceptconfidentialityandintellectualproperty agreements.Onceawinnerisdetermined,heorshewilltypically transferthepropertyrightofthesolutiontotheseekerinreturn fortheprize(JeppesenandLakhani,2010).Thus,theprojectsin quadrant1areopentoalargepoolofpotentialcontributors,but theydonotdiscloseresultsanddata.Amainreasonforthe lat-ter isthat projectsponsorsareoftenprivateorganizationsthat seektogainsomesortofacompetitiveadvantagebymaintaining uniqueaccesstoresearchresultsandnewtechnologies.Alongwith

14http://www.sciencemag.org/site/feature/contribinfo/prep/geninfo.xhtml# dataavail; http://www.nature.com/authors/policies/availability.html; retrieved October20,2011.

(10)

crowd science,however, innovation contests and crowd sourc-ingplatformsarehighlydynamic,witnessing therapidentryof newmodelsandsignificantexperimentationwithdifferent orga-nizational designs.As such, ourclassification focuses on broad differencesbetweenquadrantsandoncommoncharacteristicsof examplesasofthewritingofthisarticle.Therearemanyinteresting nuancesinthedesignsofcrowdsourcingplatformsandinnovation contestsandsomearemore“open”thanothers(i.e.,somemaybe locatedclosertoquadrant2).15Thus,adeeperunderstandingof theimplicationsofcrowdscience’shighdegreesofopennessmay beusefulforfutureworkexaminingheterogeneityandchangesin opennessamongcontestsandcrowdsourcingplatformsaswell.

Whileopennesswithrespecttoprojectparticipationandwith respecttointermediateinputsarebynomeanstheonly interest-ingaspectsofcrowdscienceprojects,andwhilenotallprojects reflectthesefeaturestothesamedegree,thesetwodimensions highlightprominentqualitativedifferencesbetweencrowdscience andotherregimesofknowledgeproduction.Moreover,wefocuson thesetwodimensionsbecausetheyhaveimportantimplications forourunderstandingofpotentialbenefitsandchallengesofcrowd science.Ashighlightedbypriorworkontheorganizationof prob-lemsolving,thefirstdimension–openparticipation–isimportant becauseitspeakstothelaborinputsandknowledgeaprojectcan drawon,andthustoitsabilitytosolveproblemsandtogenerate newknowledge.16Opennesswithrespecttointermediateinputs ourseconddimension–isanimportantrequirementfordistributed workbyalargenumberofcrowdparticipants.Atthesametime, ourdiscussionoftheroleofsecrecyintraditionalsciencesuggests thatthisdimensionmayhavefundamental implicationsforthe kindsofrewardsprojectcontributorsareabletoappropriateand thusforthekindsofmotivesandincentivesprojectscanrelyonin attractingparticipants.Weprovideamoredetaileddiscussionof thebenefitsandchallengesresultingfromopenparticipationand opendisclosureofintermediateinputsinSections4and5. 3.2. Heterogeneitywithin:Differencesinthenatureofthetask andincontributorskills

Section 3.1 highlighted two important common features of crowdscienceprojects.However,thereisalsoconsiderable het-erogeneityamong projects. Inseeking deeperinsightsinto this heterogeneity,wefocusontwodimensionsthatmayhave particu-larlyimportantimplicationsforthebenefitsandchallengescrowd scienceprojectsface.AsreflectedinFig.4,thesedimensionsinclude thenatureofthetaskthatisperformedbythecrowd (horizon-talaxis),andtheskillsthatprojectparticipantsneedinorderto makemeaningfulcontributions(verticalaxis).17Wediscusseach dimensioninturn.

Inageneralsense,allcrowdscienceprojectsultimatelyhave thegoaltogainadeeperunderstandingofthenaturalworldandto findsolutionstoscientificproblems.Aspartofaproject,thecrowd

15 Forexample,theNetflixprizecompetitioninvolvedonlyanon-exclusivelicense

(http://www.netflixprize.com/faq#signup,accessedJuly5,2013).

16 Theliterature onopeninnovationandthegovernanceofproblem-solving

activitiesfocusesonthedecisiontolocateproblemsolvingwithinagiven orga-nizationalboundary(“closed”)versusoutsidetheorganization(“open”)(Nickerson andZenger,2004;JeppesenandLakhani,2010;AfuahandTucci,2012;Felinand Zenger,2012).Whilethispriorworkcentersonfirmboundaries,manyofthe result-inginsightsareusefulinourcontextifweconceptualizetherelevantorganizational unitnotasthefirmbutasthe“core”teamofresearchersthatwouldworkona prob-lemintraditionalsciencebutnowhastheoptiontoinviteparticipationbythelarger crowd.

17 AswewilldiscussinSection5.1.3,mostprojectshavededicatedproject

organiz-erswhoofteninitiatetheprojectsandcloselyresembletheprincipalinvestigators intraditionalscientificprojects.Ourfocushereisonthetypicallymuchlargergroup ofcontributorsoutsideoftheimmediate“core”ofprojectorganizers.

performscertaintypesoftaskssuchasthecodingofastronomical data,theoptimizationofproteinstructures,orthedevelopment ofcomplexmathematicalproofs.Notethatweconceptualizethe “task”inanaggregatesenseatthelevelofthecrowd,witheach individualmakingdistinctcontributionstothattaskbyengaging incertainsubtasks.Thenatureofthetasksoutsourcedtothecrowd differsinmany ways,but priorresearchontheorganizationof problemsolvingsuggeststhatitisparticularlyusefultoconsider differencesintworelatedcharacteristics:thecomplexityofthetask andthetaskstructure(Simon,1962,1973;FelinandZenger,2012). In ourcontext,taskcomplexity is bestconceptualizedasthe degreeofinterdependencybetweentheindividualsubtasksthat participantsperformwhencontributingtoaproject.Intasksthat arenotcomplex,thebestsolutiontoasubtaskisindependentof othersubtasks,allowingcontributorstoworkindependently.For example,thecorrectclassificationofanimageinGalaxyZoodoes notdependontheclassificationsofotherimages.Incomplextasks, however,thebestsolutiontoasubtaskdependsonothersubtasks and,assuch,acontributorneedstoconsiderothercontributions whenworkingonhis/herownsubtask.Thetermtaskstructure cap-turesthedegreetowhichtheoveralltaskthatisoutsourcedtothe crowdis“well-structured”versus“ill-structured”.Well-structured tasksinvolveaclearlydefinedsetofsubtasks,thecriteriafor evalu-atingcontributionsarewellunderstood,andthe“problemspace”is essentiallymappedoutinadvance(Simon,1973).Inill-structured tasks,thespecificsubtasksthatneedtobeperformedarenotclear exante,contributionsarenoteasilyevaluated,andtheproblem spacebecomesclearerastheworkprogressesandcontributions buildoneachotherinacumulativefashion.Whiletaskcomplexity andstructurearedistinctconcepts,theyareoftenrelatedinthat complextaskstendtoalsobemoreill-structured,partlyduetoour limitedunderstandingoftheinteractionsamongdifferentsubtasks (FelinandZenger,2012).Consideringdifferencesacrossprojects withrespecttothenatureof thetaskisparticularlyimportant becausecomplexandill-structuredtasksprovidefewer opportu-nitiesforthedivisionoflaborandmayfacedistinctorganizational challenges.Wewilldiscussthesechallengesaswellaspotential mechanismstoaddresstheminSection5.

AsFig. 4 illustrates,a largeshare of crowd scienceprojects involvesprimarilytasksoflowcomplexitythattendtobe well-structured.Forexample,thecodingofimagesfortheGalaxyZoo projectinvolvesa largenumber ofindividualcontributions,but subtasksareindependentand contributorscanworkinparallel withoutregardtowhatothercontributorsaredoing.18Attheother extremeareprojectsthatinvolvehighlycomplexandill-structured tasksrequiring contributors tobuildoneach others’ workin a sequentialfashion,andtodevelopacollectiveunderstandingof theproblemspaceandofpossiblesolutionsovertime.Polymath projectsexhibitthesecharacteristicsandFig.2illustratesthe inter-active problem solving process. Finally, projects located in the middleofthehorizontalaxisinFig.4involvetasksthatareof mod-eratecomplexityandrelativelywell-structured,whilestillallowing contributorstocollaborateandtoexploredifferentapproachesto solvetheproblem.Forexample,Folditplayerscanfindthebest proteinstructureusingverydifferentapproachesthatareoften dis-coveredonlyintheprocessofproblemsolving.Moreover,while

18Onemaythinkthatthesecodingtasksaresosimplethattheirperformance

shouldnotbeconsideredasaseriousscientificcontributionatall.However,data collectionisanintegralpartofscientificresearchandmuchoftheeffort(andmoney) intraditionalresearchprojectsisexpendedondatacollection,oftencarriedoutby PhDsandpost-docs.Infact,recallthatGalaxyZoowasinitiatedbyaPhDstudent whowasoverwhelmedbythetasktocodealargenumberofgalaxies.Reflectingthis importantfunction,contributionsintheformofdatacollectionortheprovisionof dataareoftenexplicitlyrewardedwithco-authorshiponscientificpapers(Haeussler andSauermann,2013).

(11)

Fig.4. Heterogeneityamongcrowdscienceprojects.

someplayersworkindependently,many ofthemostsuccessful solutionshavebeenfoundbymembersoflargerteamswho devel-opedsolutionscollaborativelybybuildingoneachother’sideas.

LetusknowconsidertheverticalaxisinFig.4,whichreflects differentkindsofskillsthatarerequiredtomakemeaningful con-tributionstoa project(Ericksonetal.,2012).19Asnotedearlier, many“citizenscience”projectssuchasGalaxyZooaskfor con-tributionsthatrequireonlyskillsthatarecommoninthegeneral humanpopulation. Otherprojectstypicallyrequireexpertskills inspecificscientificdomains,asevidencedbytheprevalenceof trainedmathematiciansamongPolymathparticipants,orexperts inbioinformaticsintheConnect2Decodeproject.Projectslocated inthemiddleoftheverticalaxisrequireskillsthat are special-izedandlesscommonbutthatarenottiedtoaparticularscientific domain.Forexample,theArgusprojectreliesoncaptainsofships tocollectmeasuresofseabeddepthusingsonarequipment, requir-ingspecializednauticalskills.Similarly,successinproteinfolding requirestheabilitytovisualizeandmanipulatethree-dimensional shapesinspace,aspecialcognitiveskillthathaslittletodowith thetraditionalstudyofbiochemistry.20

4. Potentialbenefitsofcrowdscience

InSection3.1,wecomparedcrowdsciencetootherknowledge productionregimesandsuggestedthatcrowdscienceprojectstend tobemoreopenwithrespecttoparticipationaswellasthe disclo-sureofintermediateinputs.Wenowdiscusshowthishighdegree ofopennesscanprovidecertainadvantageswithrespecttothe creationofnewknowledgeaswellasthemotivationofproject par-ticipants.Section5considerspotentialchallengesopennessmay poseand how these challenges may beaddressed. Throughout thesediscussions,wewillconsiderhowthebenefitsandchallenges

19 Wefocusontheskillsrequiredtomakeameaningfulcontributiontoaproject

andabstractfromthesignificantheterogeneityinskillsamongcontributorstoa givenproject.Inparticular,itislikelythatsomecontributorsbenefitfromhigher levelsofskillsoraccesstoabroaderrangeofskillsthanothers.

20ItisinterestingtonotethatFolditeffectivelyusessophisticatedsoftwareto

transformataskthattraditionallyrequiredscientifictrainingintoataskthatcanbe performedbyamateursusingadifferentsetofskills.

maydifferacrossdifferenttypesofprojects,focusingprimarilyon thedistinctionsdrawninSection3.2.

4.1. Knowledge-relatedbenefits

Thescholarlyliteraturehasdevelopedseveraldifferent concep-tualizationsof thescienceandinnovationprocess,emphasizing differentproblemsolvingstrategiessuchastherecombinationof priorknowledge,thesearchforextremevalueoutcomes,orthe sys-tematictestingofcompetinghypotheses(seeKuhn,1962;Fleming, 2001; Weisberg, 2006;Boudreau et al.,2011; Afuahand Tucci, 2012).Giventhewidespectrumofproblemscrowdscienceprojects seektosolve,itisusefultodrawonmultipleconceptualizationsof problemsolvinginthinkingaboutthebenefitsprojectsmayderive fromopennessinparticipationandfromtheopendisclosureof intermediateinputs.

4.1.1. Benefitsfromopenparticipation

SomeprojectssuchasGalaxyZooorOldWeatherbenefit pri-marily from a larger quantity of laborinputs, using thousands of volunteerstocomplement scarcematerial resourcessuchas telescopesbutalsotheuniqueexpertiseofhighlytrainedproject leaders.Couldprojectsrelyonpowerfulcomputerstoaccomplish thesetaskswithouttheinvolvementofalargenumberofpeople? Itturnsoutthathumanscontinuetobemoreeffectivethan com-putersinseveralrealmsofproblemsolving,includingimageand sounddetection(MaloneandKlein,2007).Humanscanalsorely onintuitiontoimproveoptimizationprocesses,whereascomputer algorithmsoftenuserandomtrial-and-errorsearchandmayget trappedinlocaloptima.Humansalsohaveasuperiorcapacityto narrowthespaceofsolutions,whichisusefulinproblemslike pro-teinfolding,wheretoomanydegreesoffreedommakecomplete computationunfeasiblylong(Cooperetal.,2010).Another advan-tageofhumansistheirabilitytowatchoutfortheunexpected. Thisskillisessentialinmanyrealmsofscience,asevidencedby thediscoveryofnewtypesofgalaxiesandastronomicalobjectsin thecourseofGalaxyZoo’soperation.Benefitsintermsofaccess toalargerquantityoflaborarelikelytobegreatestforprojects involvingtasksthatarewell-structuredandrequireonlycommon humanskills,allowingthemtodrawonaverylargepoolof poten-tialcontributors.Thisrationale mayexplain whymanyexisting

(12)

crowdscienceprojects arelocated inthebottom-left corner of Fig.4.

Otherprojectsbenefitfromopenparticipationbecauseitallows themtoaccessrareandspecializedskills.ProjectssuchasArgus oreBird(whichaskscontributorstoidentifybirdsintheir neigh-borhoods)caneffectively “broadcast”theskillrequirementtoa largenumberofindividualsinthecrowd,enhancingthechances tofindsuitable contributors.Ashighlighted byrecent workon innovationcontests,broadcastsearchmayalsoidentifyindividuals whoalreadypossesspre-existingsuperiorsolutions(Jeppesenand Lakhani,2010;Nielsen,2011).

Finally, crowd science projects may also benefit from the diversityintheknowledge andexperiencepossessedbyproject participants,especiallywhenproblemsolutionsrequirethe recom-binationof different pieces of knowledge(Fleming, 2001; Uzzi andSpiro,2005).Knowledgediversityislikelytobehighbecause projectcontributors tend to come from different demographic backgrounds, organizations, and even scientific fields (Raddick etal.,2013).Moreover,someprojectsalsobenefitfromdiversity withrespecttocontributors’geographiclocation,includingefforts suchaseBird,Argus,ortheGreatSunflowerProject,whichseekto collectcomprehensivedataacrossalargegeographicspace.Inan efforttocollectdataonthebeepopulationacrosstheUnitedStates, forexample,TheGreatSunflowerProjectasksparticipantstogrow aspecialtypeofflowerandtocountthenumberofbeevisitsduring 15-mindailyobservations.Thebenefitsofgeographicdiversityare likelytoaccrueprimarilytoprojectsrequiringobservationaldata wheregeographyitselfplaysakeyrole.

4.1.2. Benefitsfromtheopendisclosureofintermediateinputs Crowdsciencereliesontheparticipationofmanyandoften tem-poraryparticipants.Makingintermediateinputswidelyaccessible enablesgeographicallydispersedindividualstojoinandto partic-ipateviawebinterfaces.Moreover,codifiedintermediateinputs suchas Polymathblogsor Folditrecipesprovidea memoryfor theproject,ensuringthatknowledgeisnotlostwhen contribu-torsleaveandenablingnewcontributorstoquicklycatchupand becomeefficient(seeVonKroghetal.,2003;Haefligeretal.,2008). Artifactssuchasdiscussionslogsmayalsoembodyandconvey socialnormsandstandardsthatfacilitatethecooperationand col-laborationamongparticipants(RullaniandHaefliger,2013).Thus, thereisaninterestingconnectionbetweenthetwodimensionsof opennessinthattheopendisclosureofintermediateinputsallows projectstotakefulladvantageofthebenefitsofopenparticipation (Section4.1.1).

Transparencyoftheresearchprocessand availabilityofdata mayalso facilitate the verification of results (see Lacetera and Zirulia,2011).While peerreviewintraditionalsciencedoesnot typically entail a detailed examination of intermediate inputs, theopennessof logsand dataincrowd scienceprojectsallows observerstofollow andverifytheresearchprocessin consider-abledetail. By way of example, therelatively large number of “eyes”followingthePolymathdiscussionensuredthatmistakes werequicklydetected(seeFig.2).Ofcourse,therearelimitstothis internalverificationsincecontributorswhoarenottrained scien-tistsmaynothavethenecessarybackgroundtoassesstheaccuracy ofmoresophisticateddataormethods.Assuch,theirinvolvement inverificationwillbelimitedtotaskssimilartotheonestheywould beabletoperform.21

Afinalknowledge-relatedbenefitofopendisclosureof inter-mediateinputsemergesnotatthelevelofaparticularprojectbut

21 SomeprojectssuchastheOpenDinosaurprojectorGalaxyZooinstitutionalize

thisverificationprocessbyexplicitlyaskingmultipleparticipantstocodethesame pieceofdataandcomparingtheresults.

forthemoregeneralprogressofscience.Scholarsofsciencehave arguedthatopendisclosureallowsfuturescientiststobuildupon priorworkinacumulativefashion(Nelson,2004;Sorensonand Fleming,2004;Jones,2009).Whilethesediscussionstypicallyrefer tothedisclosureoffinalprojectresults,additionalbenefitsmay accrueifotherscientistscanbuildontheintermediateinputs pro-ducedbyaproject(DasguptaandDavid,1994;Haeussleretal., 2009).Suchbenefitsareparticularlylargefordata,whichare typ-icallycostlytocollectbutcanoftenbeusedtoexaminemultiple researchquestions.Consider,forexample,theSloanDigitalSky Sur-vey,theHumanGenomedata,orCensusdatainthesocialsciences thathaveallresultedinmanyvaluablelinesofresearch.Notonly data,butalsologsanddiscussionarchivescanbeenormously help-fulforfutureresearchiftheyprovideinsightsintosuccessful prob-lemsolvingtechniques.AsillustratedinFig.2,forexample,problem solvingapproachesandintermediatesolutionsdevelopedinone Polymathdiscussionhavebeenre-usedandadaptedinanother. 4.2. Motivationalbenefits

Successful scientific research requires not only knowledge inputs but also scientists who are motivated to actually exert efforttoward solvinga particular problem. Dueto opennessin participation,crowdscienceprojectsareabletorelyon contrib-utors who are willing to exert effort without pay, potentially allowingprojectstotakeadvantageofhumanresourcesatlower financialcostthanwouldberequiredintraditionalscience.22Inthe following,wediscusssomeofthenon-pecuniarypayoffsthatmay mattertocrowdsciencecontributors,aswellasmechanismsthat projectsusetoaddressandreinforcethesetypesofmotivations.As wewillsee,someofthesemechanismsrequireconsiderable infra-structure,suggestingthateventhoughvolunteersareunpaid,their helpisnotnecessarilycostless.

Scholarshaveforalongtimeemphasizedtheroleofintrinsic motives,especiallyinthecontextofscienceandinnovation(Ritti, 1968; Hertelet al.,2003; Sauermannand Cohen, 2010; Raasch andVonHippel,2012).Intrinsicallymotivatedpeopleengagein anactivitybecausetheyenjoytheintellectualchallengeofatask, becausetheyfinditfun,orbecauseitgivesthemafeelingof accom-plishment(Amabile,1996; Ryan andDeci,2000).Such intrinsic motivesappeartobeimportantalsoincrowdscienceprojects,as illustratedinthefollowingpostofaGalaxyZooparticipant:

Ok,I’mcompletelyjealousofthepersonwhogotthatparticular image,itsamazing.Youneedtowarnpeoplejusthowaddictive thisis!Itsdangerous![...].

AfterdoingacouplehundredIwasstartingtoburnout... sud-denlythere wasakelly-greenstarintheforeground.Whoa! [...]beingthefirsttoseethesethings:who*knows*whatyou mightfind?Hooked!23

Intrinsicmotivationmaybeeasytoachievefortasksthatare inherentlyinterestingandchallenging.However,evensimpleand potentiallytedioustasks–suchascodinglargeamountsofdata orsystematicallytryingdifferentconfigurationsofmolecules–can becomeintrinsicallyrewardingiftheyareembeddedinagame-like environment(PrestopnikandCrowston,2011).Assuch,an

increas-22Arelatedargumenthasbeenmadeinthecontextofuserinnovation(Raasch andVonHippel,2012).Whilethereisnoestimateofthecostsavingscrowd sci-enceprojectsmayderivefromrelyingonnon-financialmotives,thereareestimates ofthecostsavingsachievedinOSSdevelopment.Inparticular,oneestimatepegs thevalueofunpaidcontributionstotheLinuxsystematover3billiondollars (http://linuxcost.blogspot.com/2011/03/cost-of-linux.html;retrievedNovember9, 2011).

(13)

ingnumber ofcrowdscienceprojects employ“gamification”to increaseprojectparticipation.Fig.1,forexample,showshowFoldit usesgroupandplayercompetitionstokeepcontributorsengaged. Project participants mayalso feel good aboutbeing part of aparticular projectcommunity andmayenjoy“social”benefits resultingfrompersonalinteractions (Harsand Ou,2002).While sometypesofprojects–especiallythoseinvolvingcollaborative problemsolving–willnaturallyprovidealocusforsocial interac-tion,manyprojectsalsoactivelystimulateinteraction.Forexample, theGreatSunflowerprojectnominatesgroupleaderswho oper-ate as facilitators in particular neighborhoods, communities or schools.24Otherprojectspromotesocialinteractionsbyproviding dedicatedITinfrastructure.Thishasbeenthestrategyemployed byFoldit,whereprojectlogsanddiscussionforumsallow partic-ipantstoteamuptoexchangestrategiesandcompeteingroups, fosteringnotjustenjoymentfromgaming,butalsoacollegialand “social”element.Aparticularlyinterestingaspectofintrinsicand socialbenefitsisthattheymaybenon-rival,i.e.,thebenefitstoone individualarenotdiminishedsimplybecauseanotherindividual alsoderivesthesebenefits(BonaccorsiandRossi,2003).

Anotherimportantnonpecuniarymotiveforparticipationisan interestintheparticularsubjectmatteroftheproject.Towit,when askedabouttheirparticipationmotives,oneofthereasonsmost oftenmentioned bycontributorstoGalaxyZoo wasaninterest inastronomyandamazementaboutthevastnessoftheuniverse (Raddicketal.,2013).Whilethismotiveisintrinsicinthesense thatitisnotdependentonanyexternalrewards,itisdistinctfrom challengeandplaymotivesinthatitisspecifictoaparticulartopic, potentiallylimitingtherangeofprojectsapersonwillbewilling toparticipatein.Recognizingtheimportanceofindividuals’ inter-estinparticulartopics,someplatformssuchasZooniverseenrich theworkbyprovidingscientificbackgroundinformation,by reg-ularlyinformingparticipantsaboutnewfindingsorbyallowing participantstocreatecollectionsofinterestingobjects(Fig.5).

Whileitisnotsurprisingthatprojectsinareaswithalarge num-bersofhobbyists–suchasastronomyorornithology–havebeen abletorecruitlargenumbersofvolunteers,projectsinless interest-ingareas,orprojectsaddressingverynarrowandspecificquestions aremorelikelytofacechallengesintryingtorecruitvolunteers.At thesametime,reachingouttoalargenumberofpotential contrib-utorsmayallowevenprojectswithlesspopulartopicstoidentifya sufficientnumberofindividualswithmatchinginterests.Assuch, whilepriorworkhasemphasizedthatbroadcastsearchcanmatch problemstoindividuals holdinguniqueknowledgeor solutions (Jeppesen andLakhani,2010), wesuggestthatbroadcastsearch canalsoallowthematchingofparticulartopicstoindividualswith relatedinterests.Bywayofexample,batsareprobablynot popu-laranimalsamongmostpeople,buttheprojectBatDetectivehas beenquitesuccessfulinfindingthoseindividualswhoare inter-estedinthisanimalandarewillingtolistentosoundrecordings andtoidentifydifferenttypesofbatcalls.

Anespeciallypowerfulversionofinterestinaparticular prob-lemisevidentinthegrowingnumberofprojectsthataredevotedto theunderstandingofparticulardiseasesortothedevelopmentof cures(ÅrdalandRøttingen,2012).Manyoftheseprojectsinvolve patientsand theirrelativeswho –aspotential“users”–have a verystrongpersonalstakeinthesuccessoftheproject,leading themtomakesignificanttimecommitments(seeVonHippel,2006; Marcus,2011).Totheseparticipants,itmaybeparticularly impor-tantthatprojectsalsoquicklydiscloseintermediateinputsifthey believethatdisclosurespeedsuptheprogresstowardasolutionby allowingotherstobuildupontheseinputs.

24 http://www.greatsunflower.org/garden-leader-program;retrievedOctober3,

2012.

Finally,manycrowdsciencecontributorsalsovaluethe oppor-tunitytocontributetoscience,especiallycitizenscientists,who lack the formal training to participate in traditional science (Raddicketal.,2013).AsoneGalaxyZoocontributorcommented inablogdiscussion:

Dang...thisisaddictive.Iputonsomemusicandstarted clas-sifying,andthenextthingyouknowit’shourslater.ButI’m contributingtoscience!25

Likesomeoftheothermotivesdiscussedabove,thismotive maybeparticularlybeneficialforcrowdsciencetotheextentthat individualsarewillingtoparticipateinabroadrangeofdifferent projectsandareevenwillingtohelpwith“boring”andtedioustasks aslongastheseareperceivedtobeimportantfortheprogressof science.

5. Challengesandpotentialsolutions

Opennesswithrespecttoprojectparticipationandintermediate inputscanleadtoconsiderablebenefits,butthesame characteris-ticsmayalsocreatecertainchallenges.We nowhighlightsome ofthesechallengesanddrawonthebroaderorganizational liter-aturetopointtowardorganizationalandtechnicaltoolsthatmay beusefulinaddressingthem.

5.1. Organizationalchallenges 5.1.1. Matchingprojectsandpeople

Onekeyfeatureofcrowdscienceprojectsistheiropennessto thecontributionsofalargenumberofindividuals.However,there isalargeandincreasingnumberofprojects,andthepopulationof potentialcontributorsisvast.Thus,organizationalmechanismsare neededtoallowfortheefficientmatchingofprojectsandpotential contributorswithrespecttobothskillsandinterests.One poten-tialapproachmakes iteasierfor individualstofindprojectsby aggregatinganddisseminatinginformationonongoingorplanned projects.Websitessuchasscistarter.com,forexample,offer search-abledatabasesofawiderangeofprojects,allowingindividualsto findprojectsthatfittheirparticularinterestsandlevelsof exper-tise.

We expect that an efficient alternative mechanism may be crowd science platforms that host multipleprojects and allow themtoutilizeasharedpoolofpotentialcontributorsaswellas a sharedtechnicalinfrastructure. Especiallyif projectsare sim-ilar withrespectto thefield of science, typesof tasks, or skill requirements, individuals who contributed to one project are alsolikelytobesuitable contributorstoanotherprojectonthe sameplatform.Indeed,multi-projectcrowdscienceplatformsmay enjoyconsiderablenetwork effectsbysimultaneouslyattracting more projects looking for contributors and contributors look-ing for projects to join. Such platforms are common in open sourcesoftwaredevelopment(e.g.,sourceforge.comandGitHub) andarealsoemerginginthecrowdsciencerealm(seeTable1). For example,the GalaxyZoo projecthasevolved intothe plat-formZooniverse,whichcurrentlyhostsoverfifteenprojectsthat cover different areas of science but involve very similar kinds of tasks. Whena new projectis initiated, Zooniverse routinely contacts its large and growing member baseto recruit partic-ipants, and many individuals contributetomultiple Zooniverse projects.

25 http://blogs.discovermagazine.com/badastronomy/2009/04/02/a-million-galaxies-in-a-hundred-hours/#.UU8TPVs4Vxg.

(14)

Fig.5.GalaxyZooobjectwithselectedusercomments.Source:http://talk.galaxyzoo.org/objects/AGZ0002bw2.

5.1.2. Divisionoflaborandintegrationofcontributions

Asdiscussedinsection3.2,projectsdifferwithrespecttothe complexity and structure of the taskthat is outsourced tothe crowd.Themorecomplex andill-structuredthetask, themore contributorshavetointeractand buildoneachother’s contrib-utions,limitingthenumberofpeoplewhocanworkonagiven projectatthesametime.Whileourdiscussioninsection3.2took thenatureofthetaskasgiven,projectorganizerscantrytoreduce interdependenciesandstructuretaskssuchthatalargernumber ofindividualscanparticipate,effectivelymovingprojectstoward theleftofFig.4.Opensourcesoftwaredevelopmenthas devel-opedsophisticatedmodularizationtechniquestoachievesimilar goals.Thebasicideaofmodularizationisthatalargeproblemis dividedintomanysmallerproblems,plusastrategy(the architec-ture)thatspecifieshowthemodulesfittogether.Thegoalisto designmodulesthathaveminimuminterdependencieswithone another,allowingforagreaterdivisionoflaborandparallelwork (VonKroghetal.,2003;BaldwinandClark,2006).Modularization hasalreadybeenusedinmanycrowdscienceprojects.Forexample, GalaxyZooandOldWeatherkeepthedesignoftheoverallproject centralizedandprovideindividualcontributorswithasmallpiece ofthetasksothattheycanworkindependentlyandattheirown pace.Ofcourse,notallproblemscanbeeasilymodularized;we sus-pect,forexample,thatmathematicalproblemsarelessamenable tomodularization.However,whilethenatureoftheproblemmay setlimitstomodularizationintheshortterm,advancesin infor-mationtechnologyandprojectmanagementknowledgearelikely toincreaseprojectleaders’abilitytomodularizeagivenproblem overtime(seeSimon,1973).

Justas important as distributing tasks is theeffective inte-grationofindividuals’ contributionstofindtheoverallproblem solution.Inhighlymodularizeddatacollectionandcodingtasks suchasOldWeatheroreBird,individualcontributionscan eas-ilybeintegratedintolargerdatasets.Similarly,insomeproblem

solvingtaskssuchasFoldit,individualsorteamsgenerate stand-alonesolutionsthatcanbeevaluatedusingstandardcriteriaand thatrequirelittleintegration(see Jeppesenand Lakhani,2010). Thebiggest challengeistheintegrationofcontributions in col-laborative problemsolving tasks such as Polymath, where the contributorsseekto develop a singlesolutionin an interactive fashion,e.g.,throughanongoingdiscussion.Insuchcases,much of the integrationis doneinformally as participantsread each other’s contributions. However, as the amount of information that needstoberead and processedincreases, informal mech-anisms may miss important contributions while also imposing largetime costs onproject participants(Nielsen, 2011). Filter-ing and sorting mechanisms may lower these costs to some extent,butdifficultiesinintegratingthecontributionsofalarger number of participants are likely to impose limits upon the optimalsizeofcollaborativeproblemsolvingprojectssuchas Poly-math.

5.1.3. Projectleadership

Mostcrowd scienceprojectsrequirea significantamountof projectleadership.Dependingonthenatureoftheproblem,leaders arefundamentalinframingthescientificexperiment, modulariz-ingthetask,securingaccesstofinancialandtechnicalresources, ormakingdecisionsregardinghowtoproceedatcriticaljunctures ofaproject(Mateos-GarciaandSteinmueller,2008;Wigginsand Crowston,2011).

Open source software projects such as Linux illustrate that leadershipintheformofarchitectureand kerneldesign canbe performed by a relatively small and tightly knit group of top-notchprogrammers,whilealargenumberofpeopleatalllevelsof skillscanexecutesmallerandwell-definedmodules(Shah,2006). EmergingcrowdscienceprojectssuchasGalaxyZooorFolditshow asimilarpattern:Theseprojectsaretypicallyledbywell-trained scientistswhoformulateimportantresearchquestionsanddesign

References

Related documents

As in Table 1, the first and third rows are estimated using as explanatory variables the share of immigrants in the education-age cell, while in the second and fourth row we

Security, Privacy, and Ethics Emerging Trends and Innovation Retooling Enterprise Technology for Business Continuity. Collaboration and Partnerships Across

The characterization of breathing sound was carried by Voice Activity Detection (VAD) algorithm, which is used to measure the energy of the acoustic respiratory signal during

Acknowledging the characteristic time dimension of standardized testing can further reveal the dilemma of the therapist test administrator: Administering a test demands focus on the

El trazado de Alta Velocidad de la Alternativa 1 discurre a lo largo de diez kilómetros posicionándose al norte de la red de ferrocarril actual. Cruza la Autovía

The measure of the effective tax rates derived in Section 3 varies according to the rate of profit earned on the investment project, but – in the absence of personal taxes –

However, other than a brief study of edible insects sold in a market in Khon Kaen Municipality (Wata- nabe & Satrawaha 1984), no detailed research has been done about the

Events like those held by The Floating Cinema extend the spectacular elements of film exhibition beyond the screen and in doing so bring cinema’s ways of seeing to bear on