More Time, More Work: How Time Limits Bias Estimates of Project Duration and Scope

Abstract

We propose that time limits systematically bias predictions of workers’ completion times, even when the limits are uninformative and cannot affect workers’ behavior. We show evidence for this bias in controlled laboratory studies and in a field survey. We find that longer time limits contribute to a misperception that the task involves more work, even for experienced managers making estimates in a familiar setting. This scope and duration bias has important behavioral implications, including an excessive preference for flat fee compensation contracts over contracts based on time spent working.

Keywords: Deadlines, Time Judgments, Scope Perception, Over‐learned Response, Estimation Bias

Highlights:

• Judges systematically over‐estimate task completion time when the time limit is longer.

• The biasing effect of time limits persists even when the time limits are completely uninformative and cannot affect workers’ time spent on the task.

• This happens because of changes in the perceived scope of work on account of an over‐generalized association between longer deadlines and bigger tasks.

• The bias is shown to explain an over‐reliance on flat‐fee contracts.

Meeting time limits and deadlines is a pervasive reality for managers. The deadlines a manager faces are often externally imposed and depend on other factors besides the scope of the work to be done. For example, a firm’s clients often impose deadlines, which may in turn have been imposed on the client by pre‐determined market timetables, statutory requirements, or other factors. The manager working under these time limits then needs to make important decisions, which often involve estimating the time required by workers, such as planning internal timelines and allocating needed resources. In this paper, we demonstrate that time estimates made under externally imposed deadlines suffer from a robust and consequential bias.

For managers, how much time workers are expected to take to complete a project is often a crucial input into key decisions. Underestimating the resources needed could delay or even preclude project completion. On the other hand, over‐allocating resources may represent a substantial opportunity cost or actual out‐of‐pocket cost. Likewise, the efficiency of how resources are procured may depend on the accuracy of estimated completion time. Consider, for example, a manager who is deciding between paying a temporary contract worker a flat fee for completing a project and paying the worker a metered rate (e.g. per‐hour). If the manager over‐estimates the amount of time the worker will take, the manager may agree to an expensive flat rate contract, and therefore over‐pay compared to the per‐hour cost. On the other hand, if the manager under‐estimates the amount of time the worker will take and chooses the per‐hour contract, the manager may face unexpectedly high costs.

Prescriptive models of cost‐benefit analysis assume that the manager can either accurately estimate relevant inputs, such as workers’ time, or can at least estimate an unbiased probability distribution for the time needed (Dumond & Mabert, 1988; Williams & Sugden, 1978). Foundational work in industrial and organizational psychology put a great deal of effort into defining and timing the steps in industrial processes (e.g., Lowry, Maynard, & Stegemerten, 1940), precisely because of the importance of having accurate inputs into such decision processes.

While this approach is well‐suited for the manufacturing assembly line, it is a much more difficult task for managers in the modern information economy where tasks are often highly variable, require greater human intervention, and rely more on worker flexibility and initiative. Managers making these judgments often face insufficient objective information, either to directly base the judgment on or to facilitate learning from their own past judgments. In the absence of such information, managers may rely on intuition, heuristics, and anecdotal information from various sources. The business press, for example, provides such generalized advice, and often touts the benefits of maintaining tightly controlled worker environments. Managers can read about the benefits of “a very tight deadline” for worker productivity (Schaffer, 2012), the dangers of extended deadlines (Halvorson, 2013) and why managers should (metaphorically) kick workers in the pants (Herzberg, 1986). How accurately would these managers, operating under a short vs. long external deadline, estimate the time workers will take to complete a project, a potentially crucial input into their decisions?

In this paper, we address this important question, unanswered by the prior literature. We investigate how decision makers’ subjective estimates of workers’ completion times for a task are affected by a time limit. We find that people provide higher estimates for the time to complete a task when the time limit for completion is longer. In our studies, estimating longer completion times for later deadlines, while potentially normative in some settings, persists even when the time limits are not informative and cannot influence workers’ actual completion times. We propose and test a novel scope perception account of our findings, in which longer time limits increase the perceived scope of the work. We distinguish our account from a motivation‐based account in which managers believe that shorter time limits increase workers’ pace, and rule out multiple alternative explanations, including numeric heuristics such as anchoring. We test our account in five studies, using both lay subjects and experienced managers, familiar and novel tasks, and both hypothetical scenarios and incentive‐compatible tasks.

Estimating Task‐Completion Time

Estimating task‐completion time is a duration judgment of a prospective event, and such judgments have been found to often be biased (over or under‐estimated) and extremely malleable (see Roy, Christenfeld, & McKenzie, 2005 and Halkjelsvik & Jørgensen, 2012 for reviews). Prospective duration judgments have been found to be affected by irrelevant factors, including arousal and vividness (Ahn, Liu, & Soman, 2009; Caruso, Gilbert, & Wilson, 2008) and a tendency to focus on the minute details of the target event and neglect other potential information, including past experiences (Buehler & Griffin, 2003).

Even exposure to prior instances of task completion does not ensure accurate estimates. In particular, time judgments for events that have been experienced in the past are systematically affected by contextual factors, including the interval between the occurrence of the event and time of judgment (Neter, 1970), availability of attentional resources (Block, 1992) and retrieval of cues from memory (Block, 1992; Zauberman, Levav, Diehl, & Bhargave, 2010). These biases result in inaccurate recall and make it difficult for decision makers to learn from experience (Meyvis, Ratner, & Levav, 2010).

Therefore, even experienced decision makers with access to accurate feedback may sometimes fail to utilize past completion time distributions (Buehler, Griffin, & Ross, 1994; Gruschke & Jørgensen, 2008) in making prospective time estimates.

In fact, managers may not recognize the benefit of replacing their subjective time estimates with more accurate information. The ambiguity involved in open‐ended time estimation can make these predictions particularly susceptible to over‐confidence bias (Klayman, Soll, González‐Vallejo, & Barlas, 1999), even among professionals making estimates about a familiar task (Jørgensen, Teigen, & Moløkken, 2004) and successful time managers (Francis‐Smythe & Robertson, 1999). Collectively, these factors may make managers susceptible to an over‐reliance on flawed judgment heuristics and non‐informative cues in the decision environment, potentially including external time limits.

Some research has suggested that time estimates are affected by time limit cues, including self‐generated time goals in lab studies (König, 2005; Thomas & Handley, 2008), naturally occurring and experimentally manipulated deadlines (Buehler et al., 1994), as well as customers’ expected times in applied settings with experienced professionals (Aranda & Easterbrook, 2005; Grimstad & Jørgensen, 2007; Jørgensen & Sjøberg, 2004). In one field study, for example, software companies made shorter delivery time estimates with a 3‐week deadline than without (Jørgensen & Grimstad, 2010). However, prior research on the effects of time limits has not tested the accuracy of these estimates, identified the underlying process or investigated the consequences for decision‐making.

How Time Limits Can Impact Estimates

In our investigation, we will distinguish between three distinct possibilities for how time estimates under non‐diagnostic external time limits are made. First, the time limit might not affect estimates of time, beyond providing an upper bound on the maximum possible time. This would be consistent with prior research in other domains, which has demonstrated insensitivity to important but ambiguous cues, particularly when individual judgments are made in isolation, as opposed to jointly (Hsee, Loewenstein, Blount, & Bazerman, 1999; Hsee, 1996; Shen & Urminsky, 2013).

A second possibility is that people believe that shorter time limits motivate workers to complete tasks faster. While the idea that people work more slowly when more time is available to them, known as Parkinson’s Law (Parkinson, 1955), has been broadly influential, the empirical evidence is quite mixed. Some lab studies have found evidence that people spend more time when the time limit is experimentally manipulated to be longer, particularly for open‐ended tasks where no alternative activities are available (Aronson & Landy, 1967; Brannon, Hershberger, & Brock, 1999; Bryan & Locke, 1967; Jørgensen & Sjøberg, 2001 as reported in Halkjelsvik & Jørgensen, 2012). However, other experimental studies, using close‐ended tasks, have found no significant effect of manipulated time limits on time spent working (Amabile, DeJong, & Lepper, 1976; Burgess, Enzle, & Schmaltz, 2004), and being randomly assigned to a later deadline may even lead to more procrastination, rather than more time spent on the task (Ariely & Wertenbroch, 2002).

It could be that people are well‐calibrated amateur psychologists in this domain, such that they either have learned or can infer how laxer or stricter time limits actually affect workers’ motivation. If this is the case, we would expect people’s judgments to reflect the actual relationship between completion times and time limits, such that estimates differ only to the degree that the time limits can and do affect completion times. However, people may apply this belief even when they are not well‐calibrated as to the magnitude of the effect of specific time limits. This is consistent with the view that biases in judgments often arise from an over‐generalization of useful heuristics (Baron 1972).

Lastly, we introduce a novel third possibility: people may judge the time to complete the task primarily based on the perceived scope of the work, and it is their judgments of project scope that are influenced by time limits. If people commonly use time limits which are customized for the project scope as a cue for inferring the unobserved scope of the work, they may do so even when the time limits are not informative, such as when the time limit is externally (or even randomly) determined. As a result, people would estimate that a task with a longer time limit will take longer, not because they think people will work slower, but because the task seems larger. Such over‐learned responses have been found in other judgment domains, even when the cues are not informative (distance based on visual clarity, Brunswik 1943; frequency based on subjective value, Dai, Wertenbroch, & Brendl 2008; value of a service based on its duration, Yeung & Soman 2007).

According to this novel scope perception account, time limits could affect managers’ estimated task completion times primarily because of their impressions of the task, rather than due to their beliefs about workers’ behavior or motivation. Thus, when time limits vary but the project scope is fixed, the time limit would bias estimates. Furthermore, the effect of time limits on estimates could persist even when the time limit cannot logically affect the worker’s pace (e.g. when the workers do not know about the specific externally‐imposed time limits the manager is aware of). This provides a crucial testable difference between our proposed cognitive account and the lay‐theory motivational accounts.

In our studies, we investigate specifically how one person (the manager) estimates the time it will take another person (the worker) to complete a task, rather than how long it would take a person to do the task themselves. Prior research has shown that the way people make completion time predictions for themselves may be different from how they reason about the times of others (Buehler et al., 2012; but also see Roy et al. 2013), particularly when there are deadlines (Buehler et al., 1994). Further, predictions about one’s self can be influenced by factors such as motivated reasoning (Kunda, 1990), self‐presentation motives (Leary, 1996) and strategic goal setting (Locke, Latham, Smith, Wood, & Bandura, 1990). Therefore, investigating how people estimate task completion times specifically for others, under different time limits, enables us to separate our research question about biases in belief formation from confounding factors that can arise in predicting one’s own future behavior.

In the first study, we measure the accuracy of predictions about how much time other people will take to complete a real activity. The study was conducted in two phases, with two different populations, one serving as workers completing a task, and one as judges, estimating the workers’ time.

Method: In Phase 1, participants in a laboratory (n=116) were assigned the role of workers and were all asked to solve the same digital jigsaw puzzle. Each worker was randomly assigned to one of three between‐subject conditions, and given either unlimited time, 5 minutes (a relatively short time limit), or 15 minutes (a long time limit) to complete the puzzle. All the workers were paid a flat fee of $3 in all three conditions, regardless of how long they took to solve the puzzle. They were informed about their compensation and time limit before starting the timed puzzle. In order to make sure that participants did not think that they might have to wait until time was up after solving the puzzle, they were told that they could either move on to participate in another study or leave the lab after they completed the puzzle and answered a few follow‐up questions.

The jigsaw puzzle was administered using a computer interface from the online puzzle site jigzone.com. The interface showed a timer which started counting immediately after the first piece was moved, and which stopped and continued displaying the final time when all the pieces were in place. While all participants solved the same puzzle, each participant started off with a different random arrangement of the puzzle pieces (see Appendix A for an example). As participants moved the puzzle pieces on the screen, the pieces snapped together only when the two pieces fit. As a result, the puzzle could not be solved incorrectly, ensuring the same outcome quality across workers. After each worker finished the puzzle, the completion time was recorded, and the participant was asked a few follow‐up questions about their experience participating in the study and their familiarity with jigsaw puzzles.

In Phase 2, a separate sample of online participants (n=103) were assigned the role of judges, and were provided with detailed information about the Phase 1 study, including a picture of the initial unsolved and final solved puzzle (Appendix A). They were provided full information about all three time limit conditions the various workers faced, and the compensation in the study. The information emphasized that each worker had been randomly assigned to one of the three time limit conditions, could not choose or influence their time limit and received the same $3 flat fee for their work in all conditions. Furthermore, judges knew workers were only doing one task in the allotted time and were free to leave as soon as they completed the study. Judges’ knowledge of these study characteristics is important to rule out inferences about the workers multi‐tasking or waiting for the time limit to elapse.

Judges were asked to predict the task completion time for an average worker under each of the three different time limits (order counter‐balanced). Each judge estimated the average completion time for one time limit and then, on a separate screen, for the other two time limits. Judges were told that they could earn a bonus of up to $1 based on how accurately they predicted the task completion time. Judges were told that one of their predictions would be randomly drawn and read about how the linear accuracy incentive would be computed from the selected prediction. After making their predictions, judges were asked a few follow‐up questions regarding their beliefs about the completion time, their assessment of workers’ feelings while solving the puzzle, and their own familiarity with jigsaw puzzles.

Results – Phase 1: Workers’ Completion Times: All workers solved the puzzle within the allotted time. The average times it took workers to solve the puzzle were similar in all three conditions (M_Short = 2.24, SD = 0.79; M_Long = 2.75, SD = 1.89; M_Unlimited = 2.23, SD = 1.10; F(2,113) = 1.92, p = .151). Post‐hoc tests indicated that people completed the task marginally faster under the short and unlimited time limits than under the long time limit (M_Short vs. M_Long: p = .09; M_Short vs. M_Unlimited: p = .98; M_Long vs. M_Unlimited: p = .09).¹

Therefore, even when time limits were three times longer (compared to the 5 minute limit) or absent altogether, the differences in actual time taken by the workers were small.

¹ As an additional robustness check, the findings replicated using log‐transformed time estimates to address the

Results – Phase 2: Judges’ Time Estimates: In contrast, using a between‐subjects comparison of the first estimates, judges expected large differences in the time it took workers with different time limits to complete the task. Overall, judges’ average estimates were significantly higher than the actual workers’ times in each of the three time limit conditions (Short Time: M_Judges = 3.55 vs. M_Workers = 2.24, t(68) = 7.21, p<.001; Long Time: M_Judges = 6.62 vs. M_Workers = 2.75, t(77) = 6.86, p<.001; Unlimited Time: M_Judges = 5.98 vs. M_Workers = 2.23, t(68) = 6.33, p<.001). More importantly, the over‐prediction increased with longer time limits (interaction F(2,213) = 8.36, p<.001; see Figure 1). Estimates for the short time limit condition were significantly lower than for either the long time limit (F(1,145) = 16.67, p<.001) or the unlimited‐time conditions (F(1,136) = 15.38, p<.001).

After their first estimate, each judge made two unanticipated additional time estimates on a subsequent page, for the two other time limit conditions. Within‐subjects comparisons of all three estimates replicated the findings. The average estimates for the short time limit, long time limit, and unlimited time limit conditions were 4.03 minutes (SD=1.60), 5.94 minutes (SD=2.65), and 6.89 minutes (SD=4.34), respectively. The estimate was significantly lower in the short time limit condition compared to both the long time limit (t(133)=5.16, p<.001) and the unlimited time limit (t(142)=5.28, p<.001) conditions. Estimates were not significantly different between the long time limit and the unlimited time limit conditions in either between or within‐subjects comparisons.

Overall, we find a substantial bias of time limits on completion time estimates, which does not seem to be explained by a lack of effort or experience on the judges’ part. The amount of time judges took to read the instructions and their self‐reported knowledge or experience with puzzles did not affect estimates or moderate the effect of time limits. Furthermore, judges were well calibrated when making estimates involving a diagnostic cue, correctly predicting that workers with low self‐rated knowledge of puzzles would take longer to complete the task under each of the time limits (see Appendix C for supplemental analysis).

The findings were also not well explained by the judges’ lay beliefs about the effect of time limits on workers’ pace. We asked the judges whether they thought people would work slower, faster, or at the same pace when more time was available. While the majority (70%) of judges believed that people would take longer when more time was available (i.e., expressed a belief in Parkinson’s law), this belief did not moderate the effect of time limits on completion time estimates. Furthermore, the differences in judges’ beliefs about how accountable workers felt to finish the puzzle as soon as possible or about workers’ task goals (to finish quickly or to take longer and enjoy it) across the different time limit conditions did not explain the findings. Thus, the observed bias does not seem to be attributable to judges’ beliefs about how time limits affect the workers’ state of mind.

Discussion: The results of Study 1 provide an initial demonstration of the time limit bias: people erroneously estimated more time for others to complete a task when there is a longer time limit (or no time limit), compared to a shorter time limit. We replicated this finding in all additional studies we conducted (as reported in Appendix D).

One potential explanation of our findings is that the bias could have arisen, in part, from a mistaken heuristic. First, we note that since judges were presented with all three time limits, they were making their estimates under the same set of potential anchors, and a simple anchoring account (Halkjelsvik & Jørgensen 2012) cannot explain the findings.

A more subtle concern is that judges could have made the same over‐estimates in the time limit conditions that they would have made in the unlimited time condition but bounded their estimates by the time limit (Huttenlocher, Hedges, & Bradburn, 1990). If judges simply recoded all completion times above five minutes to five, for example, the estimates in the short time limit condition should be similar to the bounded estimates in the unlimited time condition. However, even after truncating all estimates to a maximum of five minutes, judges’ estimates were significantly higher in the unlimited time limit condition than in the short time limit condition (M_5Mins = 3.55, SD = 0.69; M_Unlimited = 4.37, SD = 0.79; t(60) = 4.35, p<.01). We find the same result comparing data in the 15 minute time condition after bounding at 5 minutes (M_15Mins = 4.50, SD = 0.86) to the short time limit condition (t(69) = 4.95, p<.01).
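The logic of this bounding check can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the example estimates are hypothetical and are not the study's data. Under a pure bounding account, capping the unlimited‐time estimates at five minutes should bring their mean close to the mean of the short‐limit estimates.

```python
from statistics import mean

def bounded_mean(estimates, cap):
    """Mean after recoding every estimate above `cap` down to `cap`."""
    return mean(min(x, cap) for x in estimates)

# Hypothetical estimates in minutes (illustration only, NOT the study's data).
short_limit = [3.0, 4.0, 3.5, 5.0, 2.5]
unlimited = [4.0, 8.0, 6.0, 10.0, 3.0]

short_mean = mean(short_limit)
capped_unlimited_mean = bounded_mean(unlimited, 5.0)

# A pure bounding account predicts these two means are about equal.
print(short_mean, capped_unlimited_mean)
```

In the hypothetical data the capped mean remains above the short‐limit mean, which is the pattern the reported comparison tests for with the actual estimates.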

Alternatively, judges might have mentally eliminated from consideration all times greater than the time limit, possibly coding them as failing in the task and therefore not qualifying for inclusion. Judges might have then effectively reported conditional average times, based only on the subset of workers who they believed would finish by the time limit. To test both this censoring account and the bounding account above, we ran a replication study (n=88, reported in Appendix D) with more detailed elicitation. We asked judges to estimate the proportion of workers who would have completed the task in each of several time ranges, including the proportion who they thought did not complete the task in the allotted time. Judges estimated that, on average, 92% of the workers (SD=14.80) would complete the puzzle in under 5 minutes in the short time limit condition, but only 37% of the workers (SD=25.21) would complete the puzzle in 5 minutes or less time in the longer time limit condition (t(65)=11.14, p<.001). This finding is inconsistent with the truncation and censoring accounts. Likewise, this result cannot be explained by anchoring, as both estimates were of the proportion of people, rather than of the average time.

Why did the time limits bias judges’ estimates in Study 1? The time estimates are consistent with both a motivational lay theory account (e.g., “people work faster when there is less time”) and a scope perception account, in which judges’ predictions are driven by differences in perceptions of the task itself. In general, lay theories of motivation may contribute to such biases, particularly when task completion is open‐ended. However, in our setting, we did not find support for this account. While a majority of participants did endorse the motivational lay theory (i.e., Parkinson’s Law), those beliefs did not explain the effect of time limits on judges’ estimates. In the next study we construct a setting to directly test the motivational lay theory account.

Study 2: Budgeting Game with Irrelevant Time Limits

Method: This study was conducted in a classroom setting in two sessions (one for each condition) using both verbal and written instructions. The participants (n=33) were undergraduate students at a large Midwestern university who each participated in one session of the experiment as part of an Economics course requirement and who could earn an additional bonus based on their performance in the game.

Participants played the role of judges (e.g., project managers) in a budget‐setting exercise. They needed to budget for a hypothetical worker, who was paid a constant wage rate of 10 cents per minute for the time taken to finish the job, to paint a 20 feet by 10 feet wall. In the scenario, the organization had set a time limit to complete the project – either a short time limit (60 minutes) or a long time limit (120 minutes), varied between subjects – that the hypothetical worker did not know about.

Judges were then asked to budget for the task, by choosing how much money to allocate for the worker’s compensation (based on the time to complete the project and the constant wage rate) from the $12.00 available. Judges were incentivized to not over‐budget or under‐budget. They would earn more if they had budgeted less and the project was still completed (see Appendix A for the complete instructions provided to judges). However, if they budgeted less money than turned out to be necessary, the participant, having “failed” the budgeting exercise, would not receive any bonus. The judges were informed that the worker’s time to complete the task would be determined by drawing a number randomly from a uniform distribution between 30 minutes and 90 minutes.

Judges therefore had an incentive to provide as low a time estimate as possible (i.e. by budgeting as low an amount as possible) without under‐guessing (in which case they would not be eligible for any bonus at all). Most importantly, since the outcome depended only on the “worker time” randomly drawn from a known distribution, the optimal strategy was completely independent of the time limit. As shown in Appendix B, the optimal bid for risk‐neutral judges in either time limit condition (60 minutes or 120 minutes) was the same, $4.63, which corresponds to predicting that the worker would take approximately 46 minutes. This guess would have earned the judges a bonus of $0.88 in the game, on average.
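The independence of the outcome from the time limit can be made concrete with a short Python sketch. It assumes only the parameters stated in the scenario (a wage of 10 cents per minute and a worker time drawn uniformly between 30 and 90 minutes). The function names are hypothetical, and the bonus formula from Appendix B is not reproduced here, so the sketch does not derive the optimal bid itself.

```python
WAGE_PER_MINUTE = 0.10      # dollars per minute, as stated in the scenario
T_MIN, T_MAX = 30.0, 90.0   # bounds of the uniform draw for worker time

def implied_minutes(bid_dollars):
    """Worker time (in minutes) that a given budget can pay for."""
    return bid_dollars / WAGE_PER_MINUTE

def success_probability(bid_dollars, time_limit_minutes=None):
    """Chance that the budget covers the randomly drawn worker time.

    `time_limit_minutes` is accepted only to make the point explicit:
    it is never used, because the draw does not depend on the time limit.
    """
    minutes = implied_minutes(bid_dollars)
    p = (minutes - T_MIN) / (T_MAX - T_MIN)
    return min(1.0, max(0.0, p))

print(round(implied_minutes(4.63), 1))
print(round(success_probability(4.63, 60), 3))
print(round(success_probability(4.63, 120), 3))
```

A bid of $4.63 pays for roughly 46 minutes of work, and the probability that this covers the drawn time is the same under the 60‐minute and the 120‐minute limit.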

Results: Comprehension checks suggested that judges understood that the final completion times were drawn from a uniform distribution between 30 and 90 minutes.² Although the optimal bid (based on the information known to the judges) was the same in both conditions, judges bid significantly less in the short time limit condition than in the long time limit condition, implying a lower time estimate (M_Short = $5.26, SD=0.98; M_Long = $6.09, SD=1.33; t(31) = 2.05, p<.05; see Figure 2). In both time limit conditions, the average bid was also significantly higher than the optimal bid of $4.63 (Short time limit: t(18)=2.82, p=.01; Long time limit: t(13)=4.08, p=.001). Therefore, the longer time limit influenced judges to budget more money for the task, even though they knew that the hypothetical workers were not aware of the time limit, and the time limit did not even affect the randomly drawn time used to determine the bonus.
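As a rough plausibility check, the test statistics above can be approximated from the reported summary statistics alone. The sketch below assumes a pooled‐variance (Student) two‐sample test, which the reported 31 degrees of freedom suggest, and infers the group sizes from the one‐sample degrees of freedom (19 and 14 judges). Both are assumptions rather than details stated in the text, and the function names are hypothetical.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Two-sample Student t statistic (equal variances) and its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def one_sample_t(m, sd, n, mu0):
    """One-sample t statistic against a fixed value mu0, and its df."""
    return (m - mu0) / (sd / math.sqrt(n)), n - 1

# Group sizes inferred from the reported df: t(18) -> 19, t(13) -> 14.
t_between, df_between = pooled_t(6.09, 1.33, 14, 5.26, 0.98, 19)
t_short, df_short = one_sample_t(5.26, 0.98, 19, 4.63)
t_long, df_long = one_sample_t(6.09, 1.33, 14, 4.63)

print(round(t_between, 2), df_between)
print(round(t_short, 2), df_short)
print(round(t_long, 2), df_long)
```

The recomputed values come out close to, but not exactly equal to, the reported statistics; small gaps are expected because the inputs here are rounded means and standard deviations.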

Discussion: This study suggests that the influence of time limits on managers’ time estimates can affect their decisions (e.g. budgeting), even when the time limits are completely irrelevant to the decision. Since painting is an open‐ended task, it could be that judges thought about a higher quality of work, requiring more time, when the time limit was longer. Given that the optimal strategy was to simply solve the math problem the task implied, this would have been a bias as well. Furthermore, we replicated the findings in a study that again used jigsaw puzzles as a close‐ended constant‐quality task (n=74, see Appendix D), and in which the hypothetical workers did not know about the time limit. These findings cannot be explained by a motivational lay theory, since the workers were unaware of, and therefore could not be influenced by, the time limits.

² In the 60‐minute time‐limit condition there were 5 judges whose bids represented a completion time of more than 60 minutes, contrary to the directions. We truncated their bids to the maximum time available for the reported analysis, and we get the same results even if we discard these participants from the analysis.

Why do randomly determined time limits which workers don’t even know about lead judges to make different completion‐time estimates? We have proposed that participants may have different perceptions of the work involved under different time limits, even when the time limit does not represent a valid cue. The judges may overgeneralize from the fact that more effortful tasks often have longer time limits, such that they infer scope of work based on even non‐diagnostic time limits.

To explain our findings, this scope perception bias would need to be robust and pervasive, occurring even when participants are provided with substantial information and engage in active deliberation. In particular, judges in Study 1 were provided with detailed information about the task (e.g. they saw a picture of the puzzle and knew the number of pieces) and the bias was not moderated by the time they spent answering. Thus, a scope perception account of this finding would require that consideration of a specific time limit influences the spontaneous subjective judgment of task difficulty, potentially incorporating factors such as the similarity of the puzzle pieces or the difficulty of sorting through them to find matching pieces. This is analogous to a manager who has information about the objective quantifiable parameters of the deliverable, but whose time estimates may still depend on subjective assessment of the difficulty workers will have in completing the task.

While our results thus far are consistent with the scope perception account, we have not presented direct evidence for this possibility. In the next two studies we construct a direct test of the scope perception account by having managers estimate an objectively quantifiable aspect of the work (e.g. number of puzzle pieces).

Study 3: Scope Estimation with Jigsaw Puzzles

Method: Online participants (n=118) acting as judges read information about the puzzles available on www.jigzone.com, including the range of puzzle sizes and the best and average solution times among visitors for puzzles of various sizes (see Appendix A). Next, they read a hypothetical scenario in which one such puzzle was selected and administered to a group of students. Judges were told that the average time taken to solve the puzzle was 28 minutes. However, unlike the prior puzzle studies, in this study the judges did not know which puzzle it was or how many pieces it had. Instead, we use their estimates of the number of puzzle pieces as an objectively quantifiable measure of the perceived scope of work.

Judges were assigned to one of three time‐limit conditions. In the scenario, the same puzzle was administered to another person (the worker), who either worked on it under no time limit, or was randomly assigned (using a coin flip) to one of two conditions: a short time limit condition (30 minutes) or a long time limit condition (45 minutes). Judges were either told about the unlimited time setting only or about both the short and long time limits, depending on the condition (see Appendix A). Each judge estimated the number of pieces in the puzzle, and then, on a separate screen, the worker’s task completion time. Thus, we can quantify both the number of puzzle pieces, one aspect of the believed scope of work, and the rate of work (pieces per minute) implied by their estimates.

Results: Replicating the previous studies, judges predicted that workers would take significantly more time in the longer time limit condition than in the shorter time limit condition (MShort = 24.28, SD=5.23;


condition (MUnlimited = 31.97, SD=30.62) was not statistically different from either of the time limit conditions.

Next, we tested whether this difference in estimated task completion time also affected beliefs about differences in the scope of work, as operationalized by estimated number of pieces in the puzzle. Judges estimated that the puzzle had significantly more pieces when the deadline was 45 minutes than when the deadline was 30 minutes (MShort = 130.83, SD=58.82; MLong = 177.25, SD=96.86; t(70)= 2.46, p=.02), consistent with the scope of work account. Estimates in the unlimited condition (MUnlimited = 116.97, SD=62) were significantly lower than in the long time limit condition (t(80)=3.43, p<.01) but did not differ from those in the short time limit condition.

Lastly, we tested whether the judges’ estimates implied that the workers would work more quickly or be less likely to procrastinate when less time was available to them. We computed the implied rate of work, the estimated number of pieces divided by estimated completion time, for each judge. The implied rate of work (pieces per minute) was only slightly higher in the shorter time limit condition and the difference was not significant (MShort=5.62, SD=3.74; MLong = 4.83, SD=2.14; t(70) = 1.11, p=0.27).
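Because the implied rate is computed separately for each judge, its condition mean is a mean of ratios and need not equal the ratio of the condition means (for instance, 130.83 / 24.28 ≈ 5.39 pieces per minute, versus the reported MShort of 5.62). The following is a minimal sketch of that computation; the function name and the three judges' estimates are hypothetical and are not study data.

```python
from statistics import mean


def implied_rates(pieces, minutes):
    """Per-judge implied rate of work: estimated pieces / estimated minutes."""
    return [p / m for p, m in zip(pieces, minutes)]


# Hypothetical estimates from three judges (illustrative only, not study data).
pieces = [100, 150, 200]
minutes = [20, 25, 50]

rates = implied_rates(pieces, minutes)

# Mean of the per-judge ratios, the kind of quantity the reported means summarize.
mean_of_ratios = mean(rates)

# Ratio of the two condition means, a different quantity in general.
ratio_of_means = mean(pieces) / mean(minutes)
```

With these made-up inputs the two summaries differ, which is why the per-judge rates, rather than the condition means, are the appropriate input for the t-test on rate of work.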

Thus, we find only a small (non‐significant) difference in the rate of work, but a large significant difference in the perceived scope, between the longer and the shorter time limit conditions. This finding is consistent with our scope perception account, but does not support the alternative that the bias in time estimation is due to a general association between shorter time limits and faster rate of work.

As in Study 1, we confirmed that judges understood that the worker’s time limit was assigned at random (94%), and the results were unchanged if we exclude the four judges who failed this comprehension test. Likewise, the effect of deadlines on the estimated number of puzzle pieces was not moderated by the amount of time judges took, either to read the instructions or to make their decisions, suggesting that insufficient processing cannot explain the findings.


Discussion: In Study 3, we find that judges perceived a larger scope of work (number of puzzle pieces) when a worker had a longer (randomly assigned) time limit for the task. These results provide direct evidence for scope perception bias as the primary explanation for the effect of time limits on completion time estimates. In contrast, this finding cannot be explained by quality inferences, time limits signaling private information, or an over‐generalized belief about longer time limits yielding slower work. In the next study, we directly elicit beliefs about rate of work and replicate our findings among experienced managers making estimates in a naturalistic setting (a direct marketing campaign).

Study 4: Scope Perception Bias in Managerial Decision Making

Method: In an artifactual field experiment (Harrison & List, 2004), we interviewed managers of small‐to‐medium businesses (SMB, under 100 employees), who were responsible for deciding printing needs for their companies (n=203, recruited from a paid online panel as part of an unrelated survey). A sizable subset of the managers (35%) indicated that they had prior experience not only in purchasing printing services, but specifically in running direct marketing campaigns.

After completing survey questions about their use of office printing services, participants read a scenario in which they were asked to imagine that they had hired a third‐party vendor to send out customized mailers as part of a direct marketing campaign (see Appendix A for the study materials). The vendor would use its own list of potential customers, customizing the mailers based on other information they had about the individuals. In the scenario, after the vendor finalized the list of people to target from their database, it would take 4 weeks to customize the mailers before sending them out.

Participants were then randomly assigned to either the short (4 weeks) or long (6 weeks) time limit conditions. In the long time limit condition, the scenario elaborated that they had just come across an industry report which suggested that direct mail was least effective during late summer and they had therefore instructed the vendor to delay the mailing by 2 weeks, so that the mailers instead went out in early fall. We reiterated that this was a last minute decision after the list of potential target customers had already been finalized, and that because of this change the vendor now had 6 weeks to customize the mailers before sending them out. This manipulation creates a later deadline without any implications for project scope.

Participants, acting as judges, were randomly assigned to either estimate the completion time (number of weeks it would take the firm to prepare the customized mailers), or estimate the project scope (number of mailers which would be prepared), between subjects. Judges in all conditions then estimated the typical worker’s rate (the number of mailers prepared by a worker in a day). All estimates were elicited using an ordinal measure with six different numerical ranges (see Appendix A). The judges also estimated the number of prospective customers they thought were within the mailing area of the direct marketing campaign, indicated whether they personally had any prior experience with direct marketing campaigns, and provided their zip code. We merged in an estimate of population density based on census data for each participant’s zip code.

Results – Estimated Time: We used ordinal regression to test for the effect of the time limit on the participants’ estimates, which were elicited in terms of pre‐defined ranges. The subset of judges (n = 101) who were asked to estimate how long the project would take gave longer times when the time limit was longer (Interpolated means: MShort= 1.62, SD=0.80; MLong= 2.73, SD= 1.27; Wald(1)= 26.52, p<.001). This replicates our prior finding that longer time limits lead to the expectation that the same project will take longer among an experienced set of managers making estimates in a setting familiar to them.

Could the difference in time estimates have resulted from different anticipated rates of work, either due to a belief that workers would pace themselves according to the available time or because of an over‐generalized association between longer time limits and slower rate of work? Judges in this study directly estimated the rate of work (mailers prepared per worker per day), allowing us to test this account. While the judges did estimate a directionally slower pace of work in the longer time limit condition, the difference was not significant (MShort = 398.00 vs. MLong = 313.72, Wald(1) = 1.03, p=.31).

Furthermore, in a multivariate ordinal regression predicting project completion time, we find that shorter time limits yielded significantly lower completion time estimates (β = −2.11, Wald(1) = 26.94, p<.001) controlling for estimated rate, and there was no effect of estimated rate (β = 0.00045, Wald(1) = .876, p=.35). This suggests that the differences in judges’ time estimates under different deadline conditions cannot be explained by their belief that workers would work faster when less time was available.

Results – Estimated Scope of Work: To more directly test our proposed scope of work account, we asked the other half of the judges (n=102) to estimate the total number of mailers that would be prepared by the vendor. The judges estimated fewer mailers when the time limit was short than when two additional weeks were available (approximately 12,000 vs. 17,000, interpolating between ranges). An ordinal regression confirms that the estimated amount of work in the shorter time limit condition was marginally less than in the longer time limit condition (β = −.69, Wald(1) = 3.30, p=.07). In contrast, the rate of work (mailers per worker per day) estimated by this group of managers did not significantly depend on the time limit (MShort = 464 vs. MLong = 356, Wald(1) = 0.98, p=.32). Overall, a multivariate ordinal regression (Appendix C) reveals that judges estimated a larger scope for the completed project (more mailers sent) when the time limit was longer (β =.003, Wald(1) = 20.51, p <.001), controlling for their zip code’s population density (β =0.000043, Wald(1) = 10.21, p =.001) and estimated rate of work (mailers per day, β = −1.11, Wald(1) = 7.19, p <.01).

Discussion: In Study 4, we replicate our previous finding, this time among small‐business managers, that longer task completion times are estimated when a non‐informative deadline is longer. Importantly, while 35% of the judges reported that they had prior experience of running direct marketing campaigns like the one we had described in the study scenario, having more directly related experience did not reduce the effect of time limits on estimates of task completion time, scope or rate of work.

We also provided evidence that this finding is consistent with a scope perception bias and cannot be attributed to beliefs about workers’ pace varying with the time limit. We find that a direct measure of the scope of work, number of mailers being sent, did vary with the time limit. Controlling for other factors, judges estimated a significantly larger amount of work when the deadline was longer, consistent with the scope perception account. In contrast, we did not find a significant difference in estimates of the worker’s rate of work based on the differences in time limits.

Thus far, we have focused on investigating how time limits bias estimates of completion times. This bias could have important practical implications, to the degree that the biased time estimates are then incorporated into decisions. In the next study, we examine the potential behavioral consequence of the scope perception bias for managers’ decisions about compensating workers. In many industries like retail and hospitality (Rocco, 2013) and auto services (MacPherson, 2014) managers can hire workers either with a per‐unit‐time compensation plan (e.g. hourly wages, where the payment is based on amount of time spent working) or a flat fee compensation plan (where the payment is fixed, either salaried or for completing a given task, regardless of time spent). Decisions between different types of contracts are determined by the extent to which effort or output can be monitored (Hölmstrom, 1979), transactional costs of performing such monitoring and controlling activities (Williamson, 1981), uncertainty in the environment (Prendergast, 2000), stage in the organizational life cycle (Madhani, 2010) and the potential for sorting and self‐selection of the best fit employees into an organization (Lo, Ghosh, & Lafontaine, 2011). We propose that the scope perception bias can be an important


optimally prefer flat fee contracts over time‐based contracts particularly when the time limits are longer.

Study 5: Time Estimates and Choice of Contracts

Method – Phase 1: Workers: We used the same study design as in Study 1, only varying the type of compensation. Workers (n=113) were randomly assigned to one of four conditions in a 2 (time limit: short, 5 minutes vs. long, 15 minutes) x 2 (contract type: flat fee vs. per minute) design. Workers in the flat fee conditions were paid either $1 (in the short time limit condition) or $3 (in the long time limit condition), regardless of how long it took them to complete the puzzle. Workers in the per‐minute condition were paid 25 cents per minute (rounded up to the nearest minute) for the time taken.

Results – Phase 1: Workers’ Completion Times and Estimates: As in Study 1, we do not find a significant main effect of time limit on time taken to solve the puzzle (MShort= 2.46, SD=1.24, MLong= 2.87, SD=2.12, t(111)=1.26, p=0.21)³. However, we did observe a significant main effect of contract type, with per‐minute workers taking longer to solve the puzzle than flat fee workers (MFlat Fee = 2.16, SD=0.98, MPer‐Minute Fee = 3.19, SD=2.15, t(111)=3.31, p<.01). Regardless of contract type, there was no significant effect of time limit on the time taken by the workers (Flat Fee: MShort = 2.14, SD=1.15, MLong= 2.19, SD=0.79; t(56)=.19, p=.85; Per‐Minute Fee: MShort= 2.81, SD=1.25, MLong= 3.60, SD=2.77; t(53)=1.35, p=0.18; interaction F(1,109) =1.37, p=.24). Therefore, although the workers in general took more time to complete the task under per‐minute contracts, they did not take significantly more time when more time was available to them while working under either contract type.

Method – Phase 2: Judges: In Phase 2, an adult online sample (n=171) played an incentivized game in which they were employers who would receive a lump sum payment for getting a jigsaw puzzle completed. In this game, they needed to “employ” a worker to solve the puzzle for them and would incur an employee cost, which was deducted from their revenue. The judges earned the remaining money, after deducting the cost of hiring the worker, as profit, which they were paid for real.

3 Three workers took a little over 5 minutes in the short time limit condition, and these were truncated to 5

The study used a 2(time limit: short, long) x 2(recruiting fee for flat‐rate workers: present, absent) full factorial design. Judges were randomly assigned to one of the two time limit conditions (short, 5 minutes vs. long, 15 minutes) and one of two recruiting fee conditions (extra fee for flat‐rate workers or not), and then chose which contract type (flat fee or per‐minute fee) to pay the worker. Judges knew that workers had been randomly assigned to contract types. The specific instructions that judges read in a sample condition (long‐time limit, flat fee) are shown in Appendix B.

The cost of hiring a worker with a per‐minute contract was the same in all four conditions: 25 cents per minute, rounded up to the nearest minute, for the time taken by the worker to solve the puzzle (up to $1.25 in the short time limit condition, and up to $3.75 in the long time limit condition). The cost of hiring a worker with a flat fee contract varied by condition, depending on the time limit and whether an additional recruiting fee was charged for the use of a flat fee worker. In the two no‐recruiting‐fee conditions, the cost for hiring a flat fee worker was either $1.50 (long time limit condition) or $1 (short time limit). In the recruiting fee conditions, the cost for hiring a flat fee worker also included an additional fee (framed as paid to the recruiting agency) of either $0.60 (long time limit, for a total of $2.10) or $0.10 (short time limit, for a total of $1.10).

The total budget available to the judges was either $2.00 (short time limit) or $4.00 (long time limit) when there was no extra fee payable to the recruiting agency for hiring a worker with a flat fee contract. In the two conditions where this recruiting fee was introduced, the employer’s budget was increased correspondingly to either $2.10 (short time limit) or $4.60 (long time limit), to equalize the maximum possible earnings (see Table 1 for Judges’ potential profits in each condition). As mentioned earlier, the judges’ profit after deducting the cost of hiring the worker from the allotted budget was theirs to keep. Thus, judges faced a tradeoff between a known amount of profit if they chose the flat fee, or an unknown profit (which depended on how long their worker would take) if they chose the per‐minute fee.
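The payoff structure described above can be made concrete with a short calculation. The sketch below uses only the budgets, fees, and per‐minute rate stated in the text; the function and variable names are hypothetical, and capping the billed time at the time limit is an assumption made for illustration.

```python
import math

PER_MINUTE_RATE = 0.25  # dollars per started minute, as stated in the text

# Budgets and total flat-fee costs from the text, keyed by (time limit, recruiting fee).
CONDITIONS = {
    ("short", False): {"limit_min": 5, "budget": 2.00, "flat_cost": 1.00},
    ("long", False): {"limit_min": 15, "budget": 4.00, "flat_cost": 1.50},
    ("short", True): {"limit_min": 5, "budget": 2.10, "flat_cost": 1.10},
    ("long", True): {"limit_min": 15, "budget": 4.60, "flat_cost": 2.10},
}


def flat_profit(cond):
    """Judge's certain profit under the flat fee contract."""
    return round(cond["budget"] - cond["flat_cost"], 2)


def per_minute_profit(cond, minutes_taken):
    """Judge's profit under the per-minute contract for a given completion time.

    Assumption: billed time is capped at the time limit and rounded up
    to the next whole minute.
    """
    billed = math.ceil(min(minutes_taken, cond["limit_min"]))
    return round(cond["budget"] - PER_MINUTE_RATE * billed, 2)


# Worst-case per-minute profit (worker uses the full time limit) in each condition.
worst_case = {
    key: per_minute_profit(cond, cond["limit_min"])
    for key, cond in CONDITIONS.items()
}
```

Under these figures the flat fee profit is $1.00 (short) or $2.50 (long) with or without the recruiting fee, while the worst‐case per‐minute profit differs across time limits only when no recruiting fee is charged.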

After reading about the worker compensation options, the judges were told that their own outcome would be based on the completion time of a randomly selected actual worker, who had solved the puzzle under the contract chosen by the judge, subject to the specified time limit. We also provided judges with information on the typical demographic profile of a worker (based on Phase 1 of the study which had already been run), and showed them the puzzle interface instructions (including two pictures of the exact puzzle) that the workers had seen. To ensure comprehension, judges were prompted to re‐enter three critical pieces of information before they indicated their choice: the total time limit available, the total cost of hiring a flat fee worker, and the cost per minute of hiring a per‐minute worker.

Judges then made their choice between the flat fee and per‐minute fee contract options. After their choice of contracts, judges were asked to estimate the worker’s completion time both in the contract they had chosen, and under the other (unchosen) contract type. They were also presented with a hypothetical choice between a sure amount (equal to their profit from choosing the flat fee option) and a gamble which, unbeknownst to them, was constructed from the results of Phase 1 to match the actual distribution of profits under the per‐minute contract (see Table 2). Lastly, they answered a few questions about risk aversion, cognitive ability, and knowledge of jigsaw puzzles. After all data were collected, the profit earned was computed by pairing each participant with a randomly chosen participant in the appropriate condition in Phase 1, and the money was paid to each judge.


Results – Phase 2: Judges’ Time Estimates and Contract Choices: The recruiting‐fee conditions were designed so that the minimum profit in the per‐minute conditions for the short and long time limit were equal (unlike the non‐recruiting‐fee conditions). We found no main effects of the recruiting fee manipulation and it did not interact with any other factors (see Appendix). Since the contract choices did not depend on the worst‐case cost, we collapse across these conditions in the remaining analyses.

In the long time‐limit condition, 89% of the judges chose the flat fee contract, earning a profit of $2.50 (after paying the $0.60 fee, if applicable, and deducting worker cost). In contrast, only 51% of the judges in the short time‐limit condition chose the flat fee contract, earning $1.00 (after paying a $0.10 fee, if applicable, and deducting worker cost). The time limit had a highly significant effect on the probability of choosing the flat fee contract (χ²(1) = 28.36, p < .001).

To compensate judges who chose the per‐minute fee, we randomly picked a per‐minute worker’s actual completion time for each judge and deducted the cost of that worker. Those judges who chose the per‐minute contract actually earned significantly more profit than they would have if they had chosen the flat fee contract, in both the long time limit condition (Mper‐minute = $3.67, Mflat fee = $2.50; Δ = $1.17, t(9)=6.01, p<.001) as well as in the short time limit condition (Mper‐minute = $1.12, Mflat fee = $1.00; Δ = $0.12; t(39)=2.68, p=.01, see Figure 3). An expected‐value maximizer with unbiased beliefs would be less likely to choose the flat fee contract in the long (vs. short) time limit condition, but our judges were instead significantly more likely to do so. In effect, judges choosing the flat‐fee contract in the long‐time limit condition paid an almost 50% premium to insure themselves against the unlikely outcome that their earnings would be lower because their per‐minute worker took too long.

After making their contract choice, the judges estimated the task completion times for workers with the chosen contract type. Replicating the prior studies, judges estimated significantly more time in the long time limit conditions than in the short time limit conditions (MShort= 3.38, SD=0.86; MLong= 7.22, SD=3.26; t(169)= 10.30, p<.001). Subsequently, judges were asked to estimate the average worker’s task completion time under the alternative contract type (the one they had not chosen). Nearly all participants chose the option that would have provided a higher profit if their time estimates had been correct (84% in the short vs. 91% in the long time condition). These findings suggest that judges chose unnecessary fixed‐fee contracts, forgoing potential profits, primarily because they misestimated task completion times. As shown in Figure 4, estimated time for per‐minute workers fully mediated the effect of deadlines on contract choices. Controlling for per‐minute worker time estimates, time limits do not significantly impact contract choices.

Could these findings be attributed to a lack of attention or consideration among some judges? We measured judges’ depth of reasoning (Cognitive Reflection Test; Frederick, 2005), and the CRT scores did not moderate the effect of deadlines on the type of contract chosen. Likewise, the amount of time judges took and their self‐reported knowledge and experience with jigsaw puzzles did not moderate the effect. Thus, we find no evidence that the findings are driven by lack of attention or consideration.

In studying these contract choices, it is also important to consider the potential role of risk preferences as an alternative account. The contract choice represents a gamble between a fixed profit under the flat‐fee contract and an uncertain profit that could be either higher or lower with the per‐minute contract (Grund & Sliwka, 2010). Even judges making well‐calibrated estimates could have been willing to sacrifice expected earnings for reduced risk, preferring the “sure bet” flat fee contract when faced with the riskier long time limit. To test this, we had the judges make a hypothetical choice between a fixed amount (equivalent to the profit with the flat fee contract) and a gamble constructed to be equivalent to the actual probabilities and outcomes represented by the per‐minute contract (i.e., calculated from workers’ times in Phase 1, see Table 2). The choice was represented as a separate hypothetical gamble, unrelated to the employment game, and judges were not told that the gamble was equivalent to their contract choice.

Judges’ choices differed between the contract choice and the equivalent gamble, contrary to the risk aversion account. In the long time limit condition, 89% of the judges chose the flat fee contract that ensured a certain payment of $2.50, but significantly fewer (61%) chose the certain amount of $2.50 rather than the gamble equivalent of the per‐minute contract (McNemar's χ²(1) = 19.86, p < .001). In the short time limit condition, marginally more judges chose the flat fee option in the contract choice than chose the equivalent certain option in the gamble (MFlat Fee = 51%, MCertain Amount = 37%, McNemar's χ²(1) = 3.44, p = .06, see Figure 5). A logit model confirms that significantly more judges chose the flat‐fee contract under the longer time limit, even after controlling for risk preferences via the equivalent gamble chosen (b=1.89, Z=4.51, p<.001). We also found no interaction between a separate measure of risk aversion and the effect of deadline on type of contract chosen. Thus the preference for flat fee contracts in the long time limit condition cannot be explained by risk preferences.
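As a rough check on the paired comparisons above, the reported McNemar statistics (19.86 and 3.44, each with one degree of freedom) can be converted to p‐values with the standard library alone. The sketch below is illustrative: the function names are hypothetical, the uncorrected form of McNemar's statistic is an assumption (the text does not say whether a continuity correction was used), and the discordant counts in the example are made up.

```python
import math


def chi2_df1_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def mcnemar_statistic(b, c):
    """McNemar's chi-square (no continuity correction) from discordant counts.

    b: judges who chose the flat fee but not the sure amount
    c: judges who chose the sure amount but not the flat fee
    """
    return (b - c) ** 2 / (b + c)


# Statistics as reported in the text.
p_long = chi2_df1_pvalue(19.86)   # long time limit condition
p_short = chi2_df1_pvalue(3.44)   # short time limit condition

# Hypothetical discordant counts, for illustration only (not study data).
example_stat = mcnemar_statistic(25, 5)
```

Only judges whose two choices disagree enter the statistic, which is why the test suits this within‐judge comparison of the contract choice and the equivalent gamble.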

Discussion: The results of Study 5 demonstrate a strong preference for a flat fee contract under longer deadlines, driven by a time‐limit bias in completion time estimates and the resulting overestimate of the cost of hiring a per‐minute worker. We rule out alternative explanations of the preference for flat fee contracts under longer time limits, including risk preferences, inattention and miscomprehension. A further concern might be that judges often hire multiple temporary workers at once, rather than just one. Could the differences in behavior among workers eliminate this behavioral consequence of the scope perception bias? In a follow‐up study, judges chose which contract to use to hire 50 puzzle‐solving workers (Study S6 in Appendix D), and we strongly replicate our findings. This provides evidence that the implications of the scope perception bias for contract decisions are also robust.


Our findings suggest that most judges choose the contract that they believe has higher expected value, conditional on their (upward biased) worker completion time estimates. This suggests that the effect of time limits on contract preferences would be eliminated when the flat‐fee contract is sufficiently expensive. In an additional study (n=83), we tested a flat fee of $3.00 (i.e. doubling the cost of hiring flat fee workers) in the long time limit condition. In this case, the preference for flat fee was eliminated, with only 29% of judges choosing the flat fee. Thus, once the cost‐disparity was high enough that the flat fee contract would be less profitable even under their biased time estimates, judges showed a preference for the per‐minute rate. This further suggests that judges’ choices were not based on an aversion to per‐minute contracts, but instead arose from an optimization with flawed inputs.
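The "optimization with flawed inputs" reading can be illustrated with a simple decision rule: pick whichever contract yields the higher profit given the believed completion time. The sketch below is a simplification under stated assumptions. It applies the rule to condition means rather than individual judges' estimates, it assumes the $4.00 long‐time‐limit budget also applied in the additional study, and the function name is hypothetical.

```python
import math

PER_MINUTE_RATE = 0.25  # dollars per started minute, as stated in the text


def preferred_contract(budget, flat_cost, believed_minutes):
    """Contract with the higher profit, given a believed completion time."""
    flat = budget - flat_cost
    per_minute = budget - PER_MINUTE_RATE * math.ceil(believed_minutes)
    return "flat fee" if flat > per_minute else "per-minute"


# Long time limit, $1.50 flat fee, mean estimated time of 7.22 minutes.
biased_choice = preferred_contract(4.00, 1.50, 7.22)

# Same belief, but with the $3.00 flat fee from the additional study.
expensive_flat_choice = preferred_contract(4.00, 3.00, 7.22)

# Long time limit, $1.50 flat fee, Phase 1 mean per-minute worker time of 3.60 minutes.
calibrated_choice = preferred_contract(4.00, 1.50, 3.60)
```

Under these assumptions the rule selects the flat fee only when the inflated time estimate is combined with the cheaper flat fee, matching the pattern of choices described above.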

Given that experienced managers’ estimates are susceptible to the effect of time limits, as shown in Study 4, we would expect these findings to extend to actual managers. In a small‐scale replication of Study 5, we presented the long‐time limit scenario to 40 MBA students who all had some prior managerial experience. Among this more experienced population, predictions of the average time for a per‐minute worker were significantly higher than for a flat fee worker (MPer‐Minute=12.28 vs. MFlat‐Fee=9.21; t(39)=3.74, p<.001). Correspondingly, 85% of people chose the less profitable flat fee contract, even though only 43% preferred the equivalent sure gamble (McNemar's χ²(1) = 11.13, p < .001).

General Discussion

Across five experiments, we provide evidence that people are more likely to overestimate task completion time for others when more time is available, even when the external deadline is completely non‐diagnostic. We provide evidence that this effect is largely due to an over‐generalized association between time limits and task scope, such that a decision maker perceives a greater scope of work when the deadline is longer. This over‐learned response persists even among experienced managers and can contribute to inefficient contract choices. These findings are extremely robust, and have been replicated in all the data collected (reported in full in the Appendix), including for estimates of everyday activities.

In our studies, we rule out multiple alternative accounts and confounds. Judges made estimates in contexts where there was no effect of time limits on the rate of work (Study 1), or where the optimal decision was independent of the time limit information (Study 2). In most of our studies, the judges knew about all the time limit conditions, precluding a simple anchoring account (Halkjelsvik & Jørgensen, 2012). We also rule out various distributional‐heuristic accounts by eliciting judges’ beliefs about the distribution of workers’ task completion times, including their beliefs about the proportion of workers who would not complete in the allotted time (Study 1).

Both our general findings and the direct measurement of task scope (Studies 3 and 4) are consistent with our proposed scope perception account, in which tasks with longer time limits are seen as involving more work. This effect holds even when judges were informed that the time limit was randomized (Studies 2 and 3), or that it was varied due to an incidental reason, after the scope of work had been fixed (Study 4), which addresses the possibility that time limits serve as an informational signal. We also rule out a motivational lay theory account, in which time estimates are based on beliefs about how time limits affect workers’ pace, both by using unknown time limits (Study 2) and measuring the believed rate of work (Studies 3 and 4).

Our studies have focused on how estimates of others’ task completion times are affected by deadlines. When tasks are delegated to others, prior research (Burson, Faro, & Rottenstreich, 2010) has suggested that there can be systematic differences between how managers and workers evaluate incentives, due to differences in the inputs to that evaluation, such as probability of completion and subjective appeal of the incentive. In the domain of temporal judgments, prior research has generally found minimal differences between observers’ and actors’ judgments (Roy, Christenfeld & Jones 2013),