The dynamics of collective social behavior in a crowd controlled game

(1)

R E G U L A R A R T I C L E

Open Access

The dynamics of collective social behavior in

a crowd controlled game

Alberto Aleta

1,2*

_{and Yamir Moreno}

1,2,3

*_{Correspondence:}

[email protected]

1_{Institute for Biocomputation and}

Physics of Complex Systems, University of Zaragoza, Zaragoza, Spain

2_{Department of Theoretical Physics,}

University of Zaragoza, Zaragoza, Spain

Full list of author information is available at the end of the article

Abstract

Despite many efforts, the behavior of a crowd is not fully understood. The advent of modern communication means has made it an even more challenging problem, as crowd dynamics could be driven by both human-to-human and human-technology interactions. Here, we study the dynamics of a crowd controlled game (Twitch Plays Pokémon), in which nearly a million players participated during more than two weeks. Unlike other online games, in this event all the players controlled exactly the same character and thus it represents an exceptional example of a collective mind working to achieve a certain goal. We dissect the temporal evolution of the system dynamics along the two distinct phases that characterized the game. We find that having a fraction of players who do not follow the crowd’s average behavior is key to succeed in the game. The latter finding can be well explained by annth order Markov model that reproduces the observed behavior. Secondly, we analyze a phase of the game in which players were able to decide between two different modes of playing, mimicking a voting system. We show that the introduction of this system clearly polarized the community, splitting it in two. Finally, we discuss one of the peculiarities of these groups in the light of the social identity theory, which appears to describe well some of the observed dynamics.

Keywords: Crowd behavior; Social identity theory; Collective behavior; Swarm systems

1 Introduction

Collective phenomena have been the subject of intense research in psychology and soci-ology since the XIX century. There are several ways in which humans gather to perform collective actions, although observations suggest that most of them require some sort of diminution of self-identity [1]. One of the ﬁrst attempts to address this subject was Le Bon’s theory on the psychology of crowds in which he argued that when people are part of a crowd they lose their individual consciousness and become more primitive and emotional thanks to the anonymity provided by the group [2]. In the following decades, theories of crowd behavior such as the convergence theory, the emergent norm theory or the social identity theory emerged. These theories shifted away from Le Bon’s original ideas, intro-ducing rationality, collective norms and social identities as building blocks of the crowd [3,4].

The classical view of crowds as an irrational horde led researchers to focus on the study of crowds as something inherently violent, and thus, to seek for a better understanding

(2)

and prediction of violence eruption, or at least, to develop some strategies to handle them [5]. However, the information era has created a new kind of crowd, as it is no longer nec-essary to be in the same place to communicate and take part of collective actions. Indeed, open source and “wiki” initiatives, as well as crowdsourcing and crowdworking, are some examples of how crowds can collaborate online in order to achieve a particular objective [6,7]. Although this oﬀers a plethora of opportunities, caution has to be taken because, as research on the psychology of crowds has shown, the group is not just the simple addition of individuals [8]. For example, it has been observed that the group performance can be less eﬃcient than the sum of the individual performances if they had acted separately [9]. Under which conditions this happens and whether the group is more than the individuals composing it are two current challenges of utmost importance if, for instance, one wants to use crowds as a working force.

To be able to unlock the potential of collective intelligence, a deeper understanding of the functioning of these systems is needed [10]. Examples of scenarios that can benefit from further insights into crowd behavior include new ways to reach group decisions, such as voting, consensus making or opinion averaging, as well as finding the best strate-gies to motivate the crowd to perform some task [11]. Regarding the latter, as arbitrary tasks usually are not intrinsically enjoyable, to be able to systematically execute crowd-sourcing jobs, some sort of financial compensation is used [12]. This, however, implies dealing with new challenges, since many experiments have demonstrated that financial incentives might undermine the intrinsic motivation of workers or encourage them to only seek for the results that are being measured, either by focusing only on them or by free-riding [13–15]. A relevant case is given by platforms such as Amazon’s Mechanical Turk, that allow organizations to pay workers that perform micro-tasks for them, and that have already given rise to interesting questions about the future of crowd work [16]. In particular, its validity to be used for crowdsourcing behavioral research has been recently called into question [17].

Notwithstanding the previous observations, it is possible to ﬁnd tasks that are intrsically enjoyable by the crowd due to their motivational nature, which is ultimately in-dependent of the reward [15]. This is one of the basis of online citizen science. In these projects, volunteers contribute to analyze and interpret large datasets which are later used to solve scientiﬁc problems [18]. To increase the motivation of the volunteers, some of these projects are shaped as computer games [19]. Examples range from the study of pro-tein folding [20] to annotating people within social networks [21] or identifying the pres-ence of cropland [22].

(3)

specially interesting because it represents an out of the lab social experiment that became extremely successful based only on its intrinsic enjoyment and, given that it was run with-out any scientiﬁc purpose in mind, it represents a natural, unbiased (i.e., not artiﬁcially driven) opportunity to study the evolution and organization of crowds.

2 Description of the event

On February 12, 2014, an anonymous developer started to broadcast a game of Pokémon Red on the streaming platform Twitch [27]. Pokémon Red was the ﬁrst installment of the Pokémon series, which is the most successful role playing game (RPG) franchise of all time [28]. The purpose of the game was to capture and train creatures known as Pokémons in order to win increasingly diﬃcult battles based on classical turn-based combats. However, as Pokémon Go showed in the summer of 2016, the power of the Pokémon franchise goes beyond the classical RPG games and is still able to attract millions of players [29].

On the other hand, Twitch is an online service for watching and streaming digital video broadcast. Its content is mainly related to video games: from e-sports competitions to pro-fessional players games or simply popular individuals who tend to gather large audiences to watch them play, commonly known as streamers. Due to the live nature of the streaming and the presence of a chat window where viewers can interact among each other and with the streamer, the relationship between the media creator and the consumer is much more direct than in traditional media [30]. Back in February 2014, Twitch was the 4th largest source of peak internet traﬃc in the US [31] and nowadays, with over 100 million unique users, it has become the home of the largest gaming community in history [32].

The element that distinguished this stream from the rest was that the streamer did not play the game. Instead, he set up a bot in the chat window that accepted some predefined commands and forwarded them to the input system of the video game. Thus, anyone could join the stream and control the character by just writing one of those actions in the chat. Although all actions were sent to the video game sequentially, it could only perform one at a time. As a consequence, all commands that arrived while the character was performing a given action (which takes less than a second) did not have any effect. Thus, it was a completely crowd controlled game without any central authority or coordination system in place. This was not a multiplayer game, this was something different, something new [33].

Due to its novelty, during the ﬁrst day the game was mainly unknown with only a few tens of viewers/players and as a consequence little is known about the game events of that day [34]. However, on the second day it started to gain viewers and quickly went viral, see Fig.1. Indeed, it ramped up from 25,000 new players on day 1 (note that the time was recorded starting from day 0 and thus day 1 in game time actually refers to the second day on real time) to almost 75,000 on day 2 and an already stable base of nearly 10,000 continuous players. Even though there was a clear decay on the number of new users after day 5, the event was able to retain a large user base for over two weeks. This huge number of users imposed a challenge on the technical capabilities of the system, which translated in a delay of between 20 and 30 seconds between the stream and the chat window. That is, users had to send their commands based on where the player was up to 30 seconds ago.

(4)

Figure 1New users per day. The histogram is ﬁtted to a gamma distribution of parametersα= 2.66 and β= 0.41. Note that this reﬂects those users who inputted at least one command, not the number of viewers. In the inset we show the total number of users who sent at least 1 message each hour, regardless on whether they were new players or not

complete the game. There are 4 movement commands (up,right,downandleft), 2 actions commands (aandb,acceptandback/cancel) and 1 system button (startwhich opens the game’s menu). As a consequence the gameplay is simple. The character is moved around the map using the four movement commands. If you encounter a wild Pokémon you will have to fight it with the possibility of capturing it. Then, you will have to face the Pokémons of trainers controlled by the machine in order to obtain the 8 medals needed to finish the game. The combats are all turn-based so that time is not an important factor. In each turn of a combat the player has to decide which action to take for which the movement buttons along withaandbare used. Once the 8 medals have been collected there is a final encounter after which the game is finished. This gameplay, however, was much more complex during TPP due to the huge number of players sending commands at the same time and the lag present in the system.

A remarkable aspect of the event is that actions that would usually go unnoticed, such as selecting an object or nicknaming a Pokémon, yielded unexpected outcomes due to the messy nature of the gameplay. The community embraced these outcomes and created a whole narrative around them in the form of jokes, fan art and even a religion-like move-ment based on the judeo-christian tradition [36] both in the chat window and in related media such as Reddit. Although these characteristics of the game are outside of the scope of the paper, we believe that it would be interesting to address them as an example of the evolution of naming conventions and narrative consensus [37].

(5)

Figure 2Command distribution after the ﬁrst introduction of the voting system. Once the system was back online votes would tally up over a period of 10 seconds. After 15 minutes the system was brought down to reduce this time to 5 seconds. This, however, did not please the crowd and it started to protest. The ﬁrststart9 was sent at 5d9h8m but went almost unnoticed. Few minutes after it was sent again but this time it got the attention of the crowd. In barely 3 minutes it went from 4start9per minute to over 300, which stalled the game for over 8 minutes. The developer brought down the system again and removed the voting system, introducing the anarchy/democracy system a few hours later

original rules (see Fig.2). However, two hours later the system was modiﬁed again. Two new commands were added:democracyandanarchy, which controlled some sort of tug-of-war voting system over which rules to use. If the fraction of people voting for democracy went over a given threshold, the game would start to tally up votes about which action to take next. If not, the game would be played using the old rules. This system split the community into “democrats” and “anarchists” who would then ﬁght for taking control of the game while trying to progress.

3 Results

(6)

Figure 3Network representation of the ledge area. It is possible to go from a node to the ones surrounding it using the commandsup,right,downandleft. The only exception are the red nodes which are ledges. If the character tries to step on one of those nodes it will be automatically sent to the node right below it, characteristic that is represented by the curved links connecting nodes above and below ledges. Yellow nodes mark the entrance and exit of the area and blue nodes highlight the most diﬃcult part of the path. Note that as the original map was composed by squared tiles this network representation is not an approximation but the exact shape of the area

3.1 The ledge

On the third day of the game, the character arrived to the area depicted in Fig.3(note that the democracy/anarchy system had not been introduced yet). Each node of the graph represents a tile of the game. The character starts on the yellow node on the left part of the network and has to exit through the right part, an event that we will define as getting to one of the yellow nodes on the right. The path is simple for an average player but it represented a challenge for the crowd due to the presence of the red nodes. These nodes represent ledges which can only be traversed going downwards, effectively working as a fil-ter that allows flux only downwards. Thus, one good step will not cancel a bad step, as the character would be trapped down the ledge and will have to find a different path to go up again. For this reason, this particular region is highly vulnerable to actions deviating from the norm, either caused by mistake or performed intentionally by griefers, i.e., individuals whose only purpose is to annoy other players and who do so by using the mechanisms provided by the game itself [38,39] (note that in social contexts these individuals are usu-ally called trolls [40]). Indeed, there are paths (see blue nodes in Fig.3) where only the commandrightis needed and which are next to a ledge so that the commanddown, which is not needed at all, will force the crowd to go back and start the path again. Additionally, the existence of the lag described in Sect.2made this task even more difficult.

In Fig. 4(a) we show the time evolution of the amount of messages containing each command (the values have been normalized to the total number of commands sent each minute) since the beginning of this part until they finally exited. First, we notice that it took the crowd over 15 hours to finish an area that can be completed by an optimal walk in less than 2 minutes. Then, we can clearly see a pattern from 2d18h30m to the first time they were able to reach the nodes located right after the blue ones, approximately 3d01h10m: when the number of rightsis high the number ofleftsis low. This is a signature of the character trying to go through the blue nodes by going right, falling down the ledge, and going left to start over. Once they finally reached the nodes after the blue path (first ar-rival) they had to fight a trainer controlled by the game, combat which they lost and as a consequence the character was transported outside of the area and they had to enter and start again from the beginning. Again, we can see a similar left-right pattern until they got over that blue path for the second time, which in this case was definitive.

(7)

Figure 4Study of the Ledge event. (a) Time evolution of the fraction of commands sent each minute. Note that a single player should be able to ﬁnish this area in a few minutes, but the crowd needed 15 hours. The time series has been smoothed using moving averages. (b) Hierarchical clustering of the time series of each group of users (see main text for details). (c) Left: Mean time needed to exit the area according to our simulations as a function of the fraction of griefers in the system and the noise in it. Right: 1% quantile of the time needed to exit the area, note that theyaxis is given in minutes instead of hours

facilitates the analysis. At the same time, it took the players much longer to ﬁnish this area than what is expected for a single player. To address all these features, we propose a model aimed at mimicking the behavior of the crowd. Speciﬁcally, we consider anth order Markov Chain so that the probability of going from statexmtoxm+1depends only

on the statexm–n, thus accounting for the eﬀect of the lag of the dynamics. Furthermore,

the probabilities of going from one state to another will be set according to the behavior of the players in the crowd.

To define these probabilities, we first classify the players in groups according to the total number of commands they sent in this period: G1, users with 1 or 2 commands (46% of the users); G2, 3 or 4 commands (18%); G3, between 5 and 7 commands (13%); G4, between 8 and 14 commands (12%); G5, between 15 and 25 commands (6%); and G6, more than 25 commands (5%). These groups were defined so that the total number of messages sent by the first three is close to 50,000 and 100,000 for the other three. Interestingly, the time series of the inputs of each of these groups are very similar (see Additional files1–7). Actually, if we remove the labels of the 42 time series and cluster them using the euclidean distance, we obtain 7 clusters, one for each command. Even more, the time series of each of the commands are clustered together, Fig.4(b). In other words, the behavior of users with medium and large activities are not only similar to each other, but they are also equivalent to the ones coming from the aggregation of the users who only sent 1 or 2 commands. This allows us to infer the behavior of the whole crowd by looking at the bahavior of the most active players, group 6.

(8)

is, the probability of going from statexmtoxm+1is the probability of going fromxm–nto

the state which would follow the optimal path atxm–n+1), the system gets stuck in a loop

and is never able to reach the exit. However, that would require all players to be sending exactly the same command at the same time, something that is not seen in the data nor expected in a (uncontrolled) crowd. Thus, we consider that each time step there are 100 users with different behaviors introducing commands. In particular, we consider variable quantities of noisy users who play completely at random, griefers who only press down to annoy the rest of the crowd and the herd who always sends the optimal command to get to the exit. The results, Fig.4(c), show that the addition of noise to the herd breaks the loops and allows the crowd to get to the exit (similar results are obtained for either 20 or 30 seconds of lag, see Additional file8). In particular, for the case with no griefers we find that with 1 percent of users adding noise to the input the mean time needed to finish this part is almost 3,000 hours. However, as we increase the noise, time is quickly reduced with an optimal noise level of around 40% of the crowd. Conversely, the introduction of griefers in the model, as expected, increases the time needed to finish this part in most cases. Interestingly though, for low values of the noise, the addition of griefers can actu-ally be beneficial for the crowd. Indeed, by breaking the herding effect, these players are unintentionally helping the crowd to reach their goal.

Whether the individuals categorized as “noise” were producing it unintentionally or do-ing it on purpose to disentangle the crowd (an unknown fraction of users were aware of the eﬀects of the lag and they tried to disentangle the system [41]) is something we can not analyze because, unfortunately, the resolution of the chat log in this area is in minutes and not in seconds. We can, however, approximate the fraction of griefers in the system thanks to the special characteristics of this area. Indeed, as most of the time the command

downis not needed—on the contrary, it would destroy all progress—, we can categorize those players with an abnormal number ofdownsas griefers. To do so, we take the users that belong to G6 (the most active ones) and compare the fraction of their inputs that corresponds todownbetween each other. We ﬁnd that 7% have a behavior that could be categorized as outlier (the fraction of their input corresponding todownis higher than 1.5 times the inter quartile range). More restrictively, for 1% of the players, the command

downrepresents more than half of their inputs. Both these values are compatible with the observed time according to our model, even more if we take into account that the model is more restrictive as we consider that griefers continuously press down (not only near the blue nodes). Thus, we conclude that users deviating from the norm, regardless of being griefers, noise or even very smart individuals, were the ones that made ﬁnishing this part possible.

3.2 Anarchy vs. democracy

(9)

The introduction of the voting system was mainly motivated by a puzzle where the crowd had been stuck for over 20 hours with no progress. Nonetheless, even in democ-racy mode, progress was complex as it was necessary to retain control of the game mode plus taking into account lag when deciding which action to take. Actually, the tug-of-war system was introduced at the middle of day 5, yet the puzzle was not fully completed until the beginning of day 6, over 40 hours after the crowd had originally arrived to the puzzle. One of the reasons why it took so long to finish it even after the introduction of the voting system is that it was very difficult to enter into democracy mode. Democracy was only “allowed” by the crowd when they were right in front of the puzzle and they would go into anarchy mode quickly after finishing it. Similarly, the rest of the game was mainly played under anarchy mode. Interestingly, though, we find that there were more “democrats” in the crowd (players who only voted for democracy) than “anarchists” (play-ers who only voted for anarchy). Out of nearly 400,000 play(play-ers who participated in the tug-of-war throughout the game, 54% were democrats, 28% anarchists and 18% voted at least once for both of them. Therefore, the introduction of this new system did not only split the crowd into two polarized groups with, as we shall see, their own norms and be-haviors, but also created non trivial dynamics between them.

To explore the dynamics of these two groups, we next compare two different days: day 6 and day 8. Day 6 was the second day after the introduction of the anarchy/democracy dynamics and there were not any extremely difficult puzzles or similar areas where democ-racy might have been needed. On the other hand, day 8 was the day when the crowd arrived to the safari zone, which certainly needed democracy mode since the available number of steps in this area is limited (see description of Additional file9). We must note that, con-trary to what we observed in Sect.3.1, in this case commands coming from low activity users are not equivalent to the ones coming from high activity users. In particular, low ac-tivity users tend to vote much more for democracy (see Additional files9and10). As such, it would not be adequate to remove them from the analysis. Our results are summarized in Fig.5.

(10)

Figure 5Politics of the crowd. Days 6 (top) and 8 (bottom). In every plot the gray color represents when the game was played under anarchy rules and the blue color when it was played under democracy rules. The polar plots represent the evolution of the fraction of votes corresponding to anarchy/democracy while distinguishing if the user previously voted for anarchy or democracy: ﬁrst quadrant, votes for anarchy coming from users who previously voted for anarchy (A→A); second quadrant, votes for democracy coming from anarchy (A→D); third quadrant, votes for democracy coming from democracy (D→D); fourth quadrant, votes for anarchy coming from democracy (D→A). In the other plots we show the evolution of the total number of votes for anarchy or democracy as a function of time normalized by its maximum value (pink) as well as the position of the tug-of-war meter (black). When the meter goes above 0.75 the system enters into democracy mode (blue) until it reaches 0.25 (these thresholds were later changed to 0.80 and 0.50 respectively) when it enters into anarchy mode (gray) again. The gap in the pink curve of picture (d) is due to the lack of data in that period (seeAvailability of data and materials)

as going through a mace, and once they achieve the target they instantly lose interest in democracy.

(11)

4 Discussion

Probably the ﬁrst thing that comes to ones mind when thinking on how progressing was possible in a scenario like this is the famous experiment by Francis Galton in which he asked a crowd to guess the weight of an ox. He found that the average of all estimates of the crowd was just 0.8% higher than the real weight [44]. Indeed, when lots of users were playing, the extreme answers would cancel each other and the character would tend to move towards the most common command sent by the crowd. Note, however, that as most of the time they were not voting, actions deviating from the mean could also be performed by pure chance, as we saw in Sect.3.1.

It is worth stressing that, to form a classical wise crowd, some important elements are needed, such as independence [45]. That is, the answer of each individual should not be influenced by other people in the crowd. In our case, this was not true, as the character was continuously moving. Indeed, one of the main features of this crowd event is that opinions had effect in real time, and hence, people could see the tendency of the crowd and change its behavior accordingly. Theoretical [46] and empirical studies [47] have shown that a minority of informed individuals can lead a naïve group of animals or humans to a given target in the absence of direct communication. This was observed in Sect.3.1, as it was strictly necessary to introduce individuals who would destroy the herding effect in order to correctly direct the crowd. Besides, even in the case of conflict in the group, the time taken to reach the target is not increased significantly [47] which would explain why, even with the presence of griefers, it only took the crowd 10 times more to finish the game than the average person. Although this ratio may appear to be high, it is not: the crowd got stuck in some parts of the game for over a day, increasing the time to finish. If those parts were excluded, the game progress can be considered to be remarkably fast, despite the messy nature of the gameplay.

As a matter of fact, the movement of the character on the map can be probably better described as a swarm rather than as a crowd. Classical collective intelligence, such as the opinions of a crowd obtained via polls or surveys, has the particularity stated previously of independence and, in addition, being asynchronous. Even more, it has been shown that when users can influence each other but still in an asynchronous way, the group decisions are distorted by social biasing effects [48]. Recently, it has been proposed that the use of structures similar to natural swarms can correct some of these problems [49]. Indeed, by allowing users to participate in decision making processes in real time with feedback about what the rest is doing, in some sort of human swarm, it is possible to explore more effi-ciently the decision space and reach more accurate predictions than with simple majority voting [50]. Admittedly, it has recently been suggested that online crowds might be better described as swarms, that is, as something in-between crowds and networks [51].

(12)

tried to progress for longer. However, we saw that coordination might not be so desirable in this occasion. The problem with players conforming with the majoritarian direction or mimicking each other is that they can be subject to herding eﬀects [54,55] which in this particular setting was catastrophic due to the lag present in the system. Indeed, as we have shown, the addition of players not following the actions of the group, regardless of their motivations to do so, was the key element that allowed to complete the ledge.

Another interesting aspect of the game was the introduction of the anarchy/democracy dynamics. One of the first questions that arises is what might have motivated players to join into one group or the other. From a broad perspective, it has been proposed that one of the key ingredients behind video game enjoyment is the continuous perception of one’s causal effects on the environment, also known as effectance [56], thanks to their immedi-ate response to player inputs. In contrast, it has been observed that a reduction of control, defined as being able to influence the dynamics according to one’s goals, does not auto-matically lower enjoyment [57]. This might explain why some people preferred anarchy. Under their rules, players saw that the game was continuously responding to inputs, even if they were not exactly the ones they sent. On the other hand, with democracy, control was higher at the expense of effectance, as the game would only advance once every 5 seconds. The fact that some people might have preferred this mode is not surprising as it is well known that different people might enjoy different aspects of a game [58]. In the classical player classification proposed by Bartle [59] for the context of MUDs (multi-user dungeon, which later evolved into what we now today as MMORPGs—massively multi-player online role-playing games) he already distinguished four types of multi-players: achievers, who focus on finishing the game (who in our context could be related to democrats); ex-plorers, who focused on interacting with the world (anarchists); socializers, who focused on interacting with other players (those players who focused on making fan art and devel-oping narratives); and killers, whose main motivation was to kill other players (griefers). Similarly, it has been seen in the context of FPSs (first person shooters) that player-death events, i.e., loosing a battle, can be pleasurable for some players (anarchists) while not for others (democrats) [60].

(13)

Figure 6start9protests throughout the game. Left: fraction of input corresponding to thestart9command. Right: fraction of users sendingstart9who where in the originalstart9riot (inset, total number of protesters each day). There werestart9protests 10 days after the ﬁrst one even though less than 10% of the protesters had been part of the ﬁrst one

Figure 7Measures of frustration. (A) Players expressed their frustration by adding more times the lettero when they wanted to sayno. Even though frustration was present throughout the event, it was incremented after the events ofbloody Sunday. (B) Distribution of the number ofo. Interestingly, the relationship is not linear as the wordnootends to appear less thannoooornoooo, which indicates that when players were frustrated they overexpressed it. (C) Number of messages containing the word “why” per hour. This indicates that many players did not understand the movements of the crowd, which probably made them feel frustrated

point of view when they were playing under anarchy rules, but when the game entered into democracy mode it suddenly turned into an acceptable behavior, as predicted by the the-ory.

(14)

examples can be considered as indirect empirical measures of the amount of frustration held in the crowd.

Even though usually frustration has a negative connotation, in the context of games it has been observed that frustration and stress can be pleasurable as they motivate players to overcome new challenges [62]. Actually, there is a whole game genre known as “maso-core” (a portmanteau of masochism and hardcore) which consists of games with extremely challenging gameplay built with the only purpose of generating frustration on the players [63]. Similarly, there are games which might be simpler but that have really diﬃcult con-trols and strange physics, such as QWOP, Surgeon Simulator or Octodad, which are also built with the sole aim of generating frustration [64]. Thus, the mistakes performed by the crowd might have not been something dissatisfactory but completely the opposite, they might have been the reason why this event was so successful [65].

Additional material

Additional ﬁle 1:Command a. Time series of the fraction ofain the input of each group. (PDF 165 kB)

Additional ﬁle 2:Command b. Time series of the fraction ofbin the input of each group. (PDF 168 kB)

Additional ﬁle 3:Command start. Time series of the fraction ofstartin the input of each group. (PDF 164 kB)

Additional ﬁle 4:Command up. Time series of the fraction ofupin the input of each group. (PDF 175 kB)

Additional ﬁle 5:Command right. Time series of the fraction ofrightin the input of each group. (PDF 171 kB)

Additional ﬁle 6:Command down. Time series of the fraction ofdownin the input of each group. (PDF 163 kB)

Additional ﬁle 7:Command left. Time series of the fraction ofleftin the input of each group. (PDF 165 kB)

Additional file 8:Effect of lag. Mean time needed to exit the area according to our model with (A) 20 seconds of delay between the input and the image or (B) 30 seconds of delay between the input and the image. As stated in the main text, there was a delay in the system that ranged between 20 and 30 seconds. The amount varied depending on the number of players, the load of Twitch servers, the effect of some technical corrections that were made, etc. Thus, the particular lag at every moment is undetermined. In the main text we have used the intermediate value of 25 seconds, although as we can see in Additional file8the results are equivalent if we had set the lag to 20 or 30 seconds. (PDF 275 kB)

Additional file 9:Tug of war commitment, 2 votes. Meter position of the political tug of war if only votes from committed players (those who sent at least 2 votes throughout the whole game) are taken into account (blue) and if only votes from visitors (those players who only participated once in the voting) are taken into account (pink). The introduction of the voting system was mainly motivated by a puzzle where the crowd had been stuck for over 20 hours with no progress, but it was not the only reason. It was known that later in the game the character would need to get through a special area, the safari zone, where the number of steps one can take is limited to 500. If it goes over that limit the character would then be automatically teleported outside of the area. Voting seemed the best way to go through this area because even though progress can be slow, it is only necessary half of the crowd, at most, to be coordinated. The democracy system also added a new element, the possibility of sending compound commands. Indeed, under democracy mode it was possible to concatenate up to 9 commands, either by writing one after the other or by adding a number which would repeat the previous command that number of times. For example, aleftrightwould be equivalent to executea, thenleftand lastlyright. Similarly,start9was equivalent to pressingstart nine times in a row. However, the relevance of these commands in the results was limited. First, because most of the time the game was played under anarchy mode. And second, because the total amount of messages containing those commands is much lower than the ones for the simple ones. As briefly discussed in the main text, in contrast with the ledge event, in this case the behavior of users who sent few commands differs from the ones with several commands. In Additional file9we show the hypothetical position of the meter if only users who sent just 1 command are taken into account and if only users with 2 or more commands are taken into account. The position of the meter matches perfectly with this second set of users. If we increase the threshold and consider only users with less than 10 commands as visitors and with 10 or more as committed players, the position of the meter still resembles the results of this second group, even though differences are now noticeable. Interestingly, though, the visitors were clearly in favor of democracy as it can be noticed from the position of their votes and the slight shift downwards of the position of the committed players votes, Additional file10. (PDF 816 kB)

Additional ﬁle 10:Tug of war commitment, 10 votes. Meter position of the political tug of war if only votes from committed players (those who sent at least 10 votes throughout the whole game) are taken into account (blue) and if only votes from visitors (those players who sent less than 10 votes) are taken into account (pink). (PDF 987 kB)

Acknowledgements

(15)

Abbreviations

TPP, Twitch Plays Pokémon; RPG, Role Playing Game; MUD, Multi-User Dungeon; MMORPG, Massively Multiplayer Online Role Playing Games; FPS, ﬁrst person shooter.

Availability of data and materials

All the data regarding the event is publicly available for researches in the Internet Archive [66]. The chat logs can have either seconds (YYYY-MM-DD HH:MM:SS) or minute (YYYY-MM-DD HH:MM) resolution. As discussed in the main text, the game went viral after the ﬁrst day. Thus, even though the game started on February 12, 2014 at 23:16:01 UTC, the ﬁrst logs recorded correspond to February 14, 2014 at 08:16:19 GMT+1. Besides, the data between February 21, 2014 at 04:25:54 GMT+1 and 07:59:22 GMT+1 is missing, as it can be seen in Fig.5(D). Additionally, the whole event was recorded in video and is also available in the Internet Archive [66]. The position of the tug-of-war meter as well as the game mode active at each time were extracted from those videos using optical character recognition techniques.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

AA and YM designed the research. AA performed the research. AA and YM analyzed the data and wrote the paper. All authors read and approved the ﬁnal manuscript.

Author details

1_{Institute for Biocomputation and Physics of Complex Systems, University of Zaragoza, Zaragoza, Spain.}2_{Department of}

Theoretical Physics, University of Zaragoza, Zaragoza, Spain.3_{ISI Foundation, Turin, Italy.}

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional aﬃliations.

Received: 25 January 2019 Accepted: 30 May 2019

References

1. Abrams D, Hogg MA (2001) Collective identity: group membership and self-conception. In: Blackwell handbook of social psychology: group processes, pp 425–460

2. Le Bon G (1895) The crowd: a study of the popular mind

3. Reicher SD (2008) The psychology of crowd dynamics. In: Blackwell handbook of social psychology: group processes 4. La Macchia ST, Louis WR (2016) Crowd behaviour and collective action. In: Understanding peace and conﬂict

through social identity theory. Springer, Cham, pp 89–104

5. Reicher S, Stott C, Cronin P, Adang O (2004) An integrated approach to crowd psychology and public order policing. Policing, Int J Police Strateg Manag 27(4):558–572

6. Kozinets RV, Hemetsberger A, Schau HJ (2008) The wisdom of consumer crowds: collective innovation in the age of networked marketing. J Macromark 28(4):339–354

7. von Ahn L, Maurer B, McMillen C, Abraham D, Blum M (2008) reCAPTCHA: human-based character recognition via web security measures. Science 321(5895):1465–1468

8. Baumeister RF, Ainsworth SE, Vohs KD (2016) Are groups more or less than the sum of their members? The moderating role of individual identiﬁcation. Behav Brain Sci 39:137

9. Latané B (1981) The psychology of social impact. Am Psychol 36(4):343

10. Quinn AJ, Bederson BB (2011) Human computation: a survey and taxonomy of a growing ﬁeld. In: Proceedings of the SIGCHI conference on human factors in computing systems. CHI ’11. ACM, New York, pp 1403–1412

11. Malone TW, Laubacher R, Dellarocas C (2010) The collective intelligence genome. MIT Sloan Manag Rev 51(3):21 12. Mason W, Watts DJ (2010) Financial incentives and the performance of crowds. ACM SIGKDD Explor Newsl

11(2):100–108

13. Prendergast C (1999) The provision of incentives in ﬁrms. J Econ Lit 37(1):7–63

14. Heyman J, Ariely D (2004) Eﬀort for payment: a tale of two markets. Psychol Sci 15(11):787–793 15. Gneezy U, Rustichini A (2000) Pay enough or don’t pay at all. Q J Econ 115(3):791–810

16. Kittur A, Nickerson JV, Bernstein M, Gerber E, Shaw A, Zimmerman J, Lease M, Horton J (2013) The future of crowd work. In: Proceedings of the 2013 conference on computer supported cooperative work. CSCW ’13. ACM, New York, pp 1301–1318

17. Peer E, Brandimarte L, Samat S, Acquisti A (2017) Beyond the turk: alternative platforms for crowdsourcing behavioral research. J Exp Soc Psychol 70:153–163

18. Cox J, Oh EY, Simmons B, Lintott C, Masters K, Greenhill A, Graham G, Holmes K (2015) Deﬁning and measuring success in online citizen science: a case study of zooniverse projects. Comput Sci Eng 17(4):28–41

19. von Ahn L (2006) Games with a purpose. Computer 39(6):92–94

20. Khatib F, Cooper S, Tyka MD, Xu K, Makedon I, Popovi´c Z, Baker D, Players F (2011) Algorithm discovery by protein folding game players. Proc Natl Acad Sci USA 108(47):18949–18953

21. Bernstein M, Tan D, Smith G, Czerwinski M, Horvitz E (2009) Collabio: a game for annotating people within social networks. In: UIST ’09

22. Salk CF, Sturn T, See L, Fritz S, Perger C (2016) Assessing quality of volunteer crowdsourcing contributions: lessons from the Cropland Capture game. Int J Digit Earth 9(4):410–426

23. Birke A, Schoenau-Fog H, Reng L (2012) Space bugz!: a smartphone-controlled crowd game. In: Proceeding of the 16th international academic MindTrek conference. MindTrek ’12. ACM, New York, pp 217–219

(16)

25. Müller TF, Winters J (2018) Compression in cultural evolution: homogeneity and structure in the emergence and evolution of a large-scale online collaborative art project. PLoS ONE 13(9):0202019

26. Rappaz J, Catasta M, West R, Aberer K Latent structure in collaboration: the case of Reddit R/place.arXiv:1804.05962

27. Guinness World Records 2015 Gamer’s Edition (2014) Guinness book 28. Pokémon passes 300 million games sold (2018)https://perma.cc/DC6H-GTMV

29. Althoﬀ T, White RW, Horvitz E (2016) Inﬂuence of Pokémon Go on physical activity: study and implications. J Med Internet Res 18(12):315

30. Sjöblom M, Hamari J (2017) Why do people watch others play video games? An empirical study on the motivations of Twitch users. Comput Hum Behav 75:985–996

31. Twitch is 4th in Peak US Internet Traﬃc (2018)https://perma.cc/5V9Y-YYPG

32. Churchill BCB, Xu W (2016) The modem nation: a ﬁrst study on Twitch.TV social structure and player/game relationships. In: 2016 IEEE international conferences on big data and cloud computing (BDCloud), social computing and networking (SocialCom), sustainable computing and communications (SustainCom)

(BDCloud-SocialCom-SustainCom), pp 223–228

33. Flynn-Jones E (2015) Well played, vol 3. ETC Press, Pittsburgh 34. Twitch Plays Pokémon timeline (2018)https://perma.cc/5V9Y-YYPG

35. Time needed to ﬁnish Pokémon Red (2018)https://perma.cc/6Y69-76KH

36. Lindsey M-V (2015) Religion in digital games relodaded, vol 7. Institute for Religious Studies, University of Heidelberg, Heidelberg, Chap. 6

37. Centola D, Baronchelli A (2015) The spontaneous emergence of conventions: an experimental study of cultural evolution. Proc Natl Acad Sci USA 112(7):1989–1994

38. Kirman B, Lineham C, Lawson S (2012) Exploring mischief and mayhem in social computing or: how we learned to stop worrying and love the trolls. In: CHI ’12 extended abstracts on human factors in computing systems. ACM, New York, pp 121–130

39. Paul HL, Bowman ND, Banks J (2015) The enjoyment of grieﬁng in online games. J Gaming Virtual Worlds 7(3):243–258

40. Buckels EE, Trapnell PD, Paulhus DL (2014) Trolls just want to have fun. Pers Individ Diﬀer 67:97–102 41. A strategy to traverse the ledge (2019)https://perma.cc/FFK8-9CHG

42. Sunstein CR (1999) The law of group polarization. J Polit Philos 10(2):175–195

43. Conover M, Ratkiewicz J, Francisco MR, Gonçalves B, Menczer F, Flammini A (2011) Political polarization on Twitter. ICWSM 133:89–96

44. Galton F (1907) Vox populi (the wisdom of crowds). Nature 75(7):450–451

45. Surowiecki J (2004) The wisdom of crowds: why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations, vol 296. Anchor Books, New York

46. Couzin ID, Krause J, Franks NR, Levin SA (2005) Eﬀective leadership and decision-making in animal groups on the move. Nature 433(7025):513

47. Dyer JR, Ioannou CC, Morrell LJ, Croft DP, Couzin ID, Waters DA, Krause J (2008) Consensus decision making in human crowds. Anim Behav 75(2):461–470

48. Muchnik L, Aral S, Taylor SJ (2013) Social inﬂuence bias: a randomized experiment. Science 341(6146):647–651 49. Rosenberg LB (2015) Human swarming, a real-time method for parallel distributed intelligence. In: 2015

swarm/human blended intelligence workshop (SHBI), pp 1–7

50. Rosenberg L, Baltaxe D, Pescetelli N (2016) Crowds vs swarms, a comparison of intelligence. In: 2016 swarm/human blended intelligence workshop (SHBI), pp 1–4

51. Lee RLM (2017) Do online crowds really exist? Proximity, connectivity and collectivity. Distinktion J Soc Theory 18(1):82–94

52. Rand DG, Peysakhovich A, Kraft-Todd GT, Newman GE, Wurzbacher O, Nowak MA, Greene JD (2014) Social heuristics shape intuitive cooperation. Nat Commun 5:3677

53. Yamagishi T, Matsumoto Y, Kiyonari T, Takagishi H, Li Y, Kanai R, Sakagami M (2017) Response time in economic games reflects different types of decision conflict for prosocial and proself individuals. Proc Natl Acad Sci USA 114:6394–6399

54. Kameda T, Tsukasaki T, Hastie R, Berg N (2011) Democracy under uncertainty: the wisdom of crowds and the free-rider problem in group decision making. Psychol Rev 118(1):76–96

55. Hung AA, Plott CR (2001) Information cascades: replication and an extension to majority rule and conformity-rewarding institutions. Am Econ Rev 91(5):1508–1520

56. White RW (1959) Motivation reconsidered: the concept of competence. Psychol Rev 66(5):297–333

57. Klimmt C, Hartmann T, Frey A (2007) Eﬀectance and control as determinants of video game enjoyment. CyberPsychol Behav 10(6):845–848

58. Mekler ED, Bopp JA, Tuch AN, Opwis K (2014) A systematic review of quantitative studies on the enjoyment of digital entertainment games. In: Proceedings of the 32nd annual ACM conference on human factors in computing systems. ACM, New York, pp 927–936

59. Bartle R (1996) Hearts, clubs, diamonds, spades: players who suit muds

60. van den Hoogen W, Poels K, IJsselsteijn W, de Kort Y (2012) Between challenge and defeat: repeated player-death and game enjoyment. Media Psychol 15(4):443–459

61. Spears R, Postmes T (2015) Group identity, social inﬂuence and collective action online: extensions and applications of the SIDE model. In: Sundar S (ed) Handbooks in communication and media. Wiley–Blackwell, Chichester, pp 23–46 62. Nylund A, Landfors O (2015) Frustration and its eﬀect on immersion in games: a developer viewpoint on the good

and bad aspects of frustration. DIVA

63. The rise of masocore gaming (2018)https://perma.cc/HKD3-KP2Q

64. Games with Busted Physics and Controls (2018)https://perma.cc/Z33L-NUSW

65. Ramirez D, Saucerman J, Dietmeier J (2014) Twitch plays pokemon: a case study in big g games. In: Proceedings of DiGRA 2014, Snowbird, UT