1985; Fischer, Lynch, Paterson MAURICEHERLIHY
Department of Computer Science, Brown University, Providence, RI, USA
Keywords and Synonyms Wait-free consensus; Agreement
Asynchronous Consensus Impossibility
A
71 Problem DefinitionConsider a distributed system consisting of a set ofpro- cesses that communicate by sending and receiving mes- sages. The network is a multiset of messages, where each message is addressed to some process. A process is a state machine that can take three kinds ofsteps.
In asendstep, a process places a message in the net- work.
In areceivestep, a processAeither reads and removes from the network a message addressed toA, or it reads a distinguished null value, leaving the network un- changed. If a message addressed toAis placed in the network, and ifA subsequently performs an infinite number of receive steps, thenAwill eventually receive that message.
In acomputationstate, a process changes state without communicating with any other process.
Processes areasynchronous: there is no bound on their rel- ative speeds. Processes cancrash: they can simply halt and take no more steps. This article considers executions in which at most one process crashes.
In the consensus problem, each process starts with a privateinputvalue, communicates with the others, and then halts with adecisionvalue. These values must satisfy the following properties:
Agreement:all processes’ decision values must agree.
Validity:every decision value must be some process’ in- put.
Termination:every non-fault process must decide in a finite number of steps.
Fischer, Lynch, and Paterson showed that there is no pro- tocol that solves consensus in any asynchronous message- passing system where even a single process can fail. This result is one of the most influential results in Distributed Computing, laying the foundations for a number of subse- quent research efforts.
Terminology
Without loss of generality, one can restrict attention tobi- naryconsensus, where the inputs are 0 or 1. Aprotocol stateconsists of the states of the processes and the multi- set of messages in transit in the network. Aninitial state
is a protocol state before any process has moved, and afi- nal stateis a protocol state after all processes have finished. Thedecision valueof any final state is the value decided by all processes in that state.
Any terminating protocol’s set of possible states forms a tree, where each node represents a possible protocol state, and each edge represents a possible step by some process. Because the protocol must terminate, the tree is
finite. Each leaf node represents a final protocol state with decision value either 0 or 1.
Abivalentprotocol state is one in which the eventual decision value is not yet fixed. From any bivalent state, there is an execution in which the eventual decision value is 0, and another in which it is 1. Aunivalentprotocol state is one in which the outcome is fixed. Every execution start- ing from a univalent state decides the same value. A1-va- lentprotocol state is univalent with eventual decision value 1, and similarly for a0-valentstate.
A protocol state iscriticalif
It is bivalent, and
If any process takes a step, the protocol state becomes univalent.
Key Results
Lemma 1 Every consensus protocol has a bivalent initial state.
Proof Assume, by way of contradiction, that there ex- ists a consensus protocol for (n+ 1) threadsA0; ;An
in which every initial state is univalent. Letsibe the ini-
tial state where processes Ai; ;An have input 0 and A0; : : : ;Ai1have input 1. Clearly,s0is 0-valent: all pro-
cesses have input 0, so all must decide 0 by the validity condition. Ifsi is 0-valent, so issi+1. These states differ
only in the input to processAi: 0 insi, and 1 insi+1. Any
execution starting fromsiin whichAihalts before taking
any steps is indistinguishable from an execution starting fromsi+1in whichAihalts before taking any steps. Since
processes must decide 0 in the first execution, they must decide 1 in the second. Since there is one execution start- ing fromsi+1that decides 0, and sincesi+1is univalent by
hypothesis,si+1is 0-valent. It follows that the statesn+1, in
which all processes start with input 1, is 0-valent, a contra-
diction.
Lemma 2 Every consensus protocol has a critical state. Proof by contradiction. By Lemma 1, the protocol has a bivalent initial state. Start the protocol in this state. Re- peatedly choose a process whose next step leaves the pro- tocol in a bivalent state, and let that process take a step. Either the protocol runs forever, violating the termination condition, or the protocol eventually enters a critical state.
Theorem 3 There is no consensus protocol for an asyn- chronous message-passing system where a single process can crash.
Proof Assume by way of contradiction that such a proto- col exists. Run the protocol until it reaches a critical state
72
A
Asynchronous Consensus Impossibilitys. There must be two processesAandBsuch thatA’s next step carries the protocol to a 0-valent state, andB’s next step carries the protocol to a 1-valent state.
Starting froms, letsAbe the state reached ifAtakes the
first step,sBifBtakes the first step,sABifAtakes a step
followed byB, and so on. StatessAandsABare 0-valent,
whilesBandsBAare 1-valent. The rest is a case analysis.
Of all the possible pairs of stepsAandBcould be about to execute, most of themcommute: statessABandsBAare
identical, which is a contradiction because they have dif- ferent valences.
The only pair of steps that do not commute occurs whenAis about to send a message toB(or vice versa). LetsABbe the state resulting ifAsends a message toBand Bthen receives it, and letsBAbe the state resulting ifBre-
ceives a different message (ornull) and thenAsends its message to B. Note that every process other than Bhas the same local state in sAB andsBA. Consider an execu-
tion starting fromsABin which every process other than Btakes steps in round-robin order. BecausesAB is 0-va-
lent, they will eventually decide 0. Next, consider an exe- cution starting fromsBAin which every process other than Btakes steps in round-robin order. BecausesBAis 1-valent,
they will eventually decide 1. But all processes other than
Bhave the same local states at the end of each execution, so they cannot decide different values, a contradiction. In the proof of this theorem, and in the proofs of the preceding lemmas, we construct scenarios where at most a single process is delayed. As a result, this impossibility result holds for any system where a single process can fail undetectably.
Applications
The consensus problem is a key tool for understanding the power of various asynchronous models of computation. Open Problems
There are many open problems concerning the solvabil- ity of consensus in other models, or with restrictions on inputs.
Related Work
The original paper by Fischer, Lynch, and Paterson [8] is still a model of clarity.
Many researchers have examined alternative models of computation in which consensus can be solved. Dolev, Dwork, and Stockmeyer [5] examine a variety of alterna- tive message-passing models, identifying the precise as-
sumptions needed to make consensus possible. Dwork, Lynch, and Stockmeyer [6] derive upper and lower bounds for a semi-synchronous model where there is an upper and lower bound on message delivery time. Ben-Or [1] showed that introducing randomization makes consensus possible in an asynchronous message-passing system. Chandra and Toueg [3] showed that consensus becomes possible if in the presence of an oracle that can (unreliably) detect when a process has crashed. Each of the papers cited here has in- spired many follow-up papers. A good place to start is the excellent survey by Fich and Ruppert [7].
A protocol iswait-freeif it tolerates failures by all but one of the participants. A concurrent object implementa- tion islinearizableif each method call seems to take effect instantaneously at some point between the method’s in- vocation and response. Herlihy [9] showed that shared- memory objects can each be assigned a consensus num- ber, which is the maximum number of processes for which there exists a wait-free consensus protocol using a com- bination of read-write memory and the objects in ques- tion. Consensus numbers induce an infinite hierarchy on objects, where (simplifying somewhat) higher objects are more powerful than lower objects. In a system ofnor more concurrent processes, it is impossible to construct a lock- free implementation of an object with consensus number
nfrom an object with a lower consensus number. On the other hand, any object with consensus numbernisuni- versalin a system ofnor fewer processes: it can be used to construct a wait-free linearizable implementation of any object.
In 1990, Chaudhuri [4] introduced thek-set agreement
problem (sometimes called k-set consensus, which gen- eralizes consensus by allowingk or fewer distinct deci- sion values to be chosen. In particular, 1-set agreement is consensus. The question whetherk-set agreement can be solved in asynchronous message-passing models was open for several years, until three independent groups [2,10,11] showed that no protocol exists.
Cross References Linearizability
Topology Approach in Distributed Computing
Recommended Reading
1. Ben-Or, M.: Another advantage of free choice (extended ab- stract): Completely asynchronous agreement protocols. In: PODC ’83: Proceedings of the second annual ACM symposium on Principles of distributed computing, pp. 27–30. ACM Press, New York (1983)