468 COMPUTING IN PRESENCE OF FAULTS

3. (ø ≠ α ≠ β ≠ ø) corruption: a message is sent by x to y at time t, but one with different content is received by y at time t + 1.

While the nature of omissions and corruptions is quite obvious, that of additions may appear strange and rather artificial at first. In fact, it describes a variety of situations. The most obvious one is when sudden noise in the transmission channel is mistaken for a message. However, the more important occurrence of additions in systems is rather subtle: when we say that the received message “was not transmitted,” what we really mean is that it “was not transmitted by any authorized user.” Indeed, additions can be seen as messages surreptitiously inserted in the system by some outside, and possibly malicious, entity. Spam sent from an unsuspecting site clearly fits the description of an addition. Summarizing, additions do occur and can be very dangerous.

These three types of faults are quite incomparable with each other in terms of danger. The hierarchy of faults comes into place when two or all of these basic fault types can occur in the system (see Figure 7.2). The presence of all three types of faults creates what is called a Byzantine faulty behavior.

Notice that most localized and permanent failures can be easily modeled by communication faults; for instance, omission of all messages sent by and to an entity can be used to describe the crash failure of that entity. Analogously, with enough dynamic communication faults of the appropriate type, it is easy to describe faults such as send and receive failures, Byzantine link failures, and so forth. In fact, with at most 2(n − 1) dynamic communication faults per time unit, we can simulate the interaction of one faulty entity with its neighbors, regardless of its fault type (Exercise 7.10.39).

As in the previous section, we will concentrate on the Agreement Problem Agree(p).
The goal will be to determine if and how a certain level of agreement (i.e., value of p) can be reached in spite of a certain number F of dynamic faults of a given type τ occurring at each time unit; note that, as the faults are mobile, the set of faulty communications may change at each time unit.

Depending on the value of parameter p, we have different types of agreement problems. Of particular interest are unanimity (i.e., p = n) and strong majority (i.e., p = ⌈n/2⌉ + 1). Note that any Boolean agreement requiring less than a strong majority (i.e., p ≤ ⌈n/2⌉) can be trivially reached without any communication; for example, each entity simply chooses its own input value. We are interested only in nontrivial agreements (i.e., p > ⌈n/2⌉).

7.8.2 Limits to Number of Ubiquitous Faults for Majority

The fact that dynamic faults are not localized but ubiquitous makes the problem of designing fault-tolerant software much more difficult. The difficulty is further increased by the fact that dynamic faults may be transient and not permanent (hence harder to detect).

UBIQUITOUS FAULTS 469

Let us examine how much more difficult it is to reach a nontrivial (i.e., p > ⌈n/2⌉) agreement in presence of dynamic communication faults.

Consider a complete network. From the results we have established in the case of entity failures, we know that if only one entity crashes, the other n − 1 can agree on the same value (Theorem 7.3.1). Observe that with 2(n − 1) omissions per clock cycle, we can simulate the crash failure of a single entity: all messages sent to and from that entity are omitted at each time unit. This means that if 2(n − 1) omissions per clock cycle are localized to a single entity all the time, then agreement among n − 1 entities is possible. What happens if those 2(n − 1) omissions per clock cycle are mobile (i.e., not localized to the same entity all the time)?
Even in this case, at most a single entity will be isolated from the rest at any one time; thus, one might still reasonably expect that an agreement among n − 1 entities can be reached even if the faults are dynamic. Not only is this expectation false, but it is actually impossible to reach even strong majority (i.e., an agreement among ⌈n/2⌉ + 1 entities).

This result is an instance of a more general result that we will derive and examine in this section. As a consequence, in a network G = (V, E) with maximum node degree deg(G),

1. with deg(G) omissions per clock cycle, strong majority cannot be reached;
2. if the failures are any mixture of corruptions and additions, the same bound deg(G) holds for the impossibility of strong majority;
3. in the case of arbitrary faults (omissions, additions, and corruptions: the Byzantine case), strong majority cannot be reached if just ⌈deg(G)/2⌉ transmissions may be faulty.

Impossibility of Strong Majority

The basic result yielding the desired impossibility results for even strong majority is obtained using a “bivalency” technique similar to the one employed to prove the Single-Fault Disaster. However, the environment here is drastically different from the one considered there. We are now in a synchronous environment with all its consequences; in particular, delays are unitary, so we cannot employ arbitrarily long delays to achieve our impossibility result. Furthermore, omissions are detectable! In other words, we cannot use the same arguments, the resources at our disposal are more limited, and the task of proving impossibility is more difficult.

With this in mind, let us refresh some of the terminology and definitions we need.

Let us start with the problem. Each entity x has an input register I_x, a write-once output register O_x, and unlimited internal storage.
Initially, the input register of an entity holds a value in {0, 1}, and all the output registers are set to the same value b ∉ {0, 1}; once a value d_x ∈ {0, 1} is written in O_x, the content of that register is no longer modifiable. The goal is to have at least p > ⌈n/2⌉ entities set, in finite time, their output registers to the same value d ∈ {0, 1}, subject to the nontriviality condition (i.e., if all input values are the same, then d must be that value).

The values of the registers and of the global clock, together with the program counters and the internal storage, comprise the internal state of an entity. The states in which the output register has value v ∈ {0, 1} are distinguished as being v-decision states.

A configuration of the system consists of the internal state of all entities at a given time. An initial configuration is one in which all entities are in an initial state at time t = 0. A configuration C has decision value v if at least p entities are in a v-decision state, v ∈ {0, 1}; note that as p > ⌈n/2⌉, a configuration can have at most one decision value.

At any time t, the system is in some configuration C, and every entity can send a message to any of its neighbors. What these messages will contain depends on the protocol and on C. We describe the messages by means of a message array Λ(C) composed of n² entries defined as follows: if x_i and x_j are neighbors, then the entry Λ(C)[i, j] contains the (possibly empty) message sent by x_i to x_j; if x_i and x_j are not neighbors, then we denote this fact by Λ(C)[i, j] = ∗, where ∗ is a distinguished symbol.

In the actual communication, some of these messages will not be delivered or their content will be corrupted, or a message will arrive when none has been sent.
We will describe what happens by means of another n × n array, called the transmission matrix τ for Λ(C), defined as follows: if x_i and x_j are neighbors, then the entry τ[i, j] of the matrix contains the communication pair (α, β), where α = Λ(C)[i, j] is what x_i sent and β is what x_j actually receives; if x_i and x_j are not neighbors, then we denote this fact by τ[i, j] = (∗, ∗). Where no ambiguity arises, we will omit the indication C from Λ(C).

Clearly, because of the different number and types of faults and the different ways in which faults can occur, many transmission matrices are possible for the same Λ. We will denote by T(Λ) the set of all possible transmission matrices τ for Λ.

Once the transmission specified by τ has occurred, the clock is incremented by one unit to t + 1; depending on its internal state, on the current clock value, and on the received messages, each entity x_i prepares a new message for each neighbor x_j and enters a new internal state. The entire system enters a new configuration τ{C}. We will call τ an event and the passage from one configuration to the next a step.

Let R^1(C) = R(C) = {τ{C} : τ ∈ T(Λ(C))} be the set of all possible configurations resulting from C in one step, sometimes called succeeding configurations of C. Generalizing, let R^k(C) be the set of all possible configurations resulting from C in k > 0 steps, and let R*(C) = {C′ : ∃t > 0, C′ ∈ R^t(C)} be the set of configurations reachable from C. A configuration that is reachable from some initial configuration is said to be accessible.

Let v ∈ {0, 1}. A configuration C is v-valent if there exists a t ≥ 0 such that all C′ ∈ R^t(C) have decision value v; that is, a v-valent configuration will always result in at least p entities deciding on v. A configuration C is bivalent if there exist in R*(C) both a 0-valent and a 1-valent configuration.
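As a rough illustration of the message array and transmission matrix just defined, the following Python sketch classifies each communication pair (α, β) and counts the faulty entries of a matrix. The encoding is an assumption made only for this example: `None` stands for the empty message, the string `"*"` stands for the distinguished symbol ∗, and the helper names `classify`, `transmission_matrix`, and `count_faults` are hypothetical.

```python
EMPTY = None   # assumption: None models "no message"
STAR = "*"     # assumption: marks pairs of entities that are not neighbors

def classify(alpha, beta):
    """Classify one communication pair (alpha = sent, beta = received)."""
    if alpha == beta:
        return "ok"          # delivered as sent, or nothing sent and nothing received
    if beta is EMPTY:
        return "omission"    # something was sent, nothing arrived
    if alpha is EMPTY:
        return "addition"    # nothing was sent, something arrived
    return "corruption"      # a message with different content arrived

def transmission_matrix(sent, received):
    """Pair each entry of the message array with what was actually received."""
    n = len(sent)
    return [[(sent[i][j], received[i][j]) for j in range(n)] for i in range(n)]

def count_faults(tau):
    """Number of faulty transmissions in the transmission matrix tau."""
    return sum(classify(a, b) != "ok" for row in tau for (a, b) in row)

# Three entities that are all neighbors of each other (made-up messages).
sent = [[STAR, "a", EMPTY], ["b", STAR, "c"], [EMPTY, "d", STAR]]
received = [[STAR, "a", "z"], ["b", STAR, EMPTY], [EMPTY, "x", STAR]]
tau = transmission_matrix(sent, received)
kinds = sorted(k for row in tau for (a, b) in row
               if (k := classify(a, b)) != "ok")
print(count_faults(tau), kinds)
```

In this made-up matrix one transmission of each fault type occurs, so an event like this one would only belong to a set of events that allows at least three faults per time unit.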
If two configurations C′ and C″ differ only in the internal state of entity x_j, we say that they are j-adjacent, and we call them adjacent if they are j-adjacent for some j.

We will be interested in sets of events (i.e., transmission matrices) that preserve adjacency of configurations. We call a set S of events j-adjacency preserving if for any two j-adjacent configurations C′ and C″ there exist in S two events τ′ and τ″ for Λ(C′) and Λ(C″), respectively, such that τ′{C′} and τ″{C″} are j-adjacent. We call S adjacency preserving if it is j-adjacency preserving for all j.

A set S of events is continuous if for any configuration C and for any τ′, τ″ ∈ S for Λ(C), there exists a finite sequence τ_0, …, τ_m of events in S for Λ(C) such that τ_0 = τ′, τ_m = τ″, and τ_i{C} and τ_{i+1}{C} are adjacent, 0 ≤ i < m.

We are interested in sets of events with at most F faults that contain an event for all possible message matrices. A set S of events is F-admissible, 0 ≤ F ≤ 2|E|, if for each message matrix Λ there is an event τ ∈ S for Λ that contains at most F faulty transmissions; furthermore, there is an event in S that contains exactly F faulty transmissions.

As we will see, any F-admissible set of events that is both continuous and j-adjacency preserving for some j will make any strong majority protocol fail.

To prove our impossibility result, we are going to use two properties that follow immediately from the definitions of state and of event.

First of all, if an entity is in the same state in two different configurations A and B, then it will send the same messages in both configurations. That is, let s_i(C) denote the internal state of x_i in C; then

Property 7.8.1 For two configurations A and B, let Λ(A) and Λ(B) be the corresponding message matrices. If s_j(A) = s_j(B) for some entity x_j, then ⟨Λ(A)[j, 1], …, Λ(A)[j, n]⟩ = ⟨Λ(B)[j, 1], …, Λ(B)[j, n]⟩.
Next, if an entity is in the same state in two different configurations A and B, and it receives the same messages in both configurations, then it will enter the same state in both resulting configurations. That is,

Property 7.8.2 Let A and B be two configurations such that s_j(A) = s_j(B) for some entity x_j, and let τ′ and τ″ be events for Λ(A) and Λ(B), respectively. Let τ′[i, j] = (α′_{i,j}, β′_{i,j}) and τ″[i, j] = (α″_{i,j}, β″_{i,j}). If β′_{i,j} = β″_{i,j} for all i, then s_j(τ′{A}) = s_j(τ″{B}).

Given a set S of events and an agreement protocol P, let P(P, S) denote the set of all initial configurations and those that can be generated in all executions of P when the events are those in S.

Theorem 7.8.1 Let S be continuous, j-adjacency preserving, and F-admissible, F > 0. Let P be a (⌈(n − 1)/2⌉ + 2)-agreement protocol. If P(P, S) contains two accessible j-adjacent configurations, a 0-valent and a 1-valent one, then P is not correct in spite of F communication faults in S.

Proof. Assume to the contrary that P is a (⌈(n − 1)/2⌉ + 2)-agreement protocol that is correct in spite of F > 0 communication faults when the only possible events are those in S.

Now let A and B be j-adjacent accessible configurations that are 0-valent and 1-valent, respectively.

As S is j-adjacency preserving, there exist in S two events, π_1 for Λ(A) and ρ_1 for Λ(B), such that the resulting configurations π_1{A} and ρ_1{B} are j-adjacent. For the same reason, there exist in S two events, π_2 and ρ_2, such that the resulting configurations π_2{π_1{A}} and ρ_2{ρ_1{B}} are j-adjacent. Continuing to reason in this way, we have that there are in S two events, π_t and ρ_t, such that the resulting configurations π^t(A) = π_t{π_{t−1}{… π_2{π_1{A}} …}} and ρ^t(B) = ρ_t{ρ_{t−1}{… ρ_2{ρ_1{B}} …}} are j-adjacent. As P is correct, there exists a t ≥ 1 such that π^t(A) and ρ^t(B) have a decision value.
As A is 0-valent, at least ⌈n/2⌉ + 1 entities have decision value 0 in π^t(A); similarly, as B is 1-valent, at least ⌈n/2⌉ + 1 entities have decision value 1 in ρ^t(B). This means that there exists at least one entity x_i, i ≠ j, that has decision value 0 in π^t(A) and 1 in ρ^t(B): excluding x_j, at least ⌈n/2⌉ of the other n − 1 entities decide 0 in π^t(A) and at least ⌈n/2⌉ decide 1 in ρ^t(B), and since 2⌈n/2⌉ > n − 1 the two groups must share an entity. Hence, s_i(π^t(A)) ≠ s_i(ρ^t(B)). However, as π^t(A) and ρ^t(B) are j-adjacent, they only differ in the state of one entity, x_j: a contradiction. As a consequence, P is not correct.

We can now prove the main negative result.

Theorem 7.8.2 (Impossibility of Strong Majority) Let S be adjacency preserving, continuous, and F-admissible. Then no p-agreement protocol is correct in spite of F communication faults in S for p > ⌈n/2⌉.

Proof. Assume P is a correct (⌈n/2⌉ + 1)-agreement protocol in spite of F communication faults when the message system returns only events in S. In a typical bivalency approach, the proof involves two steps: first, it is argued that there is some initial configuration in which the decision is not already predetermined; second, it is shown that it is possible to forever postpone entering a configuration with a decision value.

Lemma 7.8.1 P(P, S) has an initial bivalent configuration.

Proof. By contradiction, let every initial configuration in P(P, S) be v-valent for some v ∈ {0, 1}, and let P be correct. By definition, there is at least one 0-valent initial configuration A and one 1-valent initial configuration B; hence there must be a 0-valent initial configuration and a 1-valent initial configuration that are adjacent. In fact, let A_0 = A, and let A_h denote the configuration obtained by changing into 1 a single 0 input value of A_{h−1}, 1 ≤ h ≤ z(A), where z(A) is the number of 0s in A; similarly define B_h, 0 ≤ h ≤ z(B), where z(B) is the number of 0s in B. By construction, A_{z(A)} = B_{z(B)}. Consider the sequence

A = A_0, A_1, …, A_{z(A)} = B_{z(B)}, …, B_1, B_0 = B.
In it, each configuration is adjacent to the following one; as it starts with a 0-valent and ends with a 1-valent configuration, it contains a 0-valent configuration adjacent to a 1-valent one. By Theorem 7.8.1 it follows that P is not correct: a contradiction. Hence, in P(P, S) there must be an initial bivalent configuration.

Lemma 7.8.2 Every bivalent configuration in P(P, S) has a succeeding bivalent configuration.

Proof. Let C be a bivalent configuration in P(P, S). If C has no succeeding bivalent configuration, then C has at least one 0-valent and at least one 1-valent succeeding configuration, say A and B. Let τ′, τ″ ∈ S be such that τ′{C} = A and τ″{C} = B. As S is continuous, there exists a sequence τ_0, …, τ_m of events in S for Λ(C) such that τ_0 = τ′, τ_m = τ″, and τ_i{C} and τ_{i+1}{C} are adjacent, 0 ≤ i < m. Consider now the corresponding sequence of configurations:

A = τ′{C} = τ_0{C}, τ_1{C}, τ_2{C}, …, τ_m{C} = τ″{C} = B.

As this sequence starts with a 0-valent and ends with a 1-valent configuration, it contains a 0-valent configuration adjacent to a 1-valent one. By Theorem 7.8.1, P is not correct: a contradiction. Hence, every bivalent configuration in P(P, S) has a succeeding bivalent configuration.

From Lemmas 7.8.1 and 7.8.2, it follows that there exists an infinite sequence of accessible bivalent configurations, each derivable in one step from the preceding one. This contradicts the assumption that for each initial configuration C there exists a t ≥ 0 such that every C′ ∈ R^t(C) has a decision value; thus, P is not correct. This concludes the proof of Theorem 7.8.2.

Consequences

The Impossibility of Strong Majority result provides a powerful tool for proving impossibility results for nontrivial agreement: if it can be shown that a set S of events is adjacency preserving, continuous, and F-admissible, then no nontrivial agreement is possible for the types and numbers of faults implied by S.
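The chain of initial configurations used in the proof of Lemma 7.8.1 can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: an initial configuration is represented by its vector of input bits alone, and the function name `adjacent_chain` is hypothetical.

```python
def adjacent_chain(a, b):
    """Return input vectors leading from a to b, one bit changed per step.

    Each vector is walked up to the all-ones vector by turning its 0s
    into 1s one at a time (A_0, A_1, ..., A_z(A)); the walk for b is then
    traversed backwards (B_z(B), ..., B_1, B_0).
    """
    def walk_up(v):
        current = list(v)
        seq = [list(current)]
        for idx, bit in enumerate(v):
            if bit == 0:
                current[idx] = 1          # change a single 0 input into 1
                seq.append(list(current))
        return seq

    a_seq = walk_up(a)                    # ends in the all-ones vector
    b_seq = walk_up(b)                    # also ends in the all-ones vector
    return a_seq + b_seq[-2::-1]          # drop the duplicate, reverse the rest

def differing_positions(u, v):
    """Count the entities whose input differs between two vectors."""
    return sum(x != y for x, y in zip(u, v))

chain = adjacent_chain([0, 0, 0], [1, 0, 1])
for conf in chain:
    print(conf)
```

Consecutive vectors in the chain differ in exactly one input, which is what makes the corresponding initial configurations adjacent; if every configuration in the chain were univalent, some neighboring pair would have to switch from 0-valent to 1-valent.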
Obviously, not every set S of events is adjacency preserving; unfortunately, all the ones we are interested in are so. A summary is shown in Figure 7.18.

Omission Faults

We can use the Impossibility of Strong Majority result to prove that no strong majority protocol is correct in spite of deg(G) communication faults, even when the faults are only omissions.

Let Omit be the set of all events containing at most deg(G) omission faults. Thus, by definition, Omit is deg(G)-admissible.

To verify that Omit is continuous, consider a configuration C and any two events τ′, τ″ ∈ Omit for Λ(C). Let m′_1, m′_2, …, m′_{f′} be the f′ faulty communications in τ′, and let m″_1, m″_2, …, m″_{f″} be the f″ faulty communications in τ″. As Omit is deg(G)-admissible, f′ ≤ deg(G) and f″ ≤ deg(G). Let τ′_0 = τ′, and let τ′_h denote the event obtained by replacing the faulty communication m′_h in τ′_{h−1} with a nonfaulty one (with the same message sent in both), 1 ≤ h ≤ f′; similarly define τ″_h, 0 ≤ h ≤ f″. By construction, τ′_{f′} = τ″_{f″}. Consider the sequence

τ′_0, τ′_1, …, τ′_{f′} = τ″_{f″}, …, τ″_1, τ″_0.

In this sequence, each event is adjacent to the following one (consecutive events differ in a single transmission, so the resulting configurations can differ only in the state of its receiver); furthermore, as by construction each event contains at most deg(G) omissions, it is in Omit. Thus, Omit is continuous.

[FIGURE 7.18: Impossibility. Minimum number of faults per clock cycle that may render strong majority impossible. Starting from No Faults: additions and corruptions (A + C), deg(G); omissions (O), deg(G); Byzantine (A + C + O), ⌈deg(G)/2⌉.]

We can now show that Omit is adjacency preserving. Given a message matrix Λ, let ψ_{Λ,l} denote the event for Λ where all and only the messages sent by x_l are lost. Then, for each Λ and l, ψ_{Λ,l} ∈ Omit. Let configurations A and B be l-adjacent. Consider the events ψ_{Λ(A),l} and ψ_{Λ(B),l} for A and B, respectively, and the resulting configurations A′ and B′. By Properties 7.8.1 and 7.8.2, it follows that A′ and B′ are also l-adjacent. Hence Omit is adjacency preserving.
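The adjacency-preservation argument for Omit can be checked on a toy instance. The Python sketch below builds the event in which all and only the messages sent by x_l are lost, and verifies that every entity receives the same messages from two message arrays that differ only in the row of x_l. Everything concrete here is an assumption made for illustration: the 4-entity ring, the made-up message strings, `None` for the empty message, `"*"` for the symbol ∗, and the helper names `psi`, `received_by`, and `omissions`.

```python
STAR = "*"     # assumption: marks pairs of entities that are not neighbors
EMPTY = None   # assumption: models the empty message

def psi(lam, l):
    """Event for message array lam in which exactly x_l's messages are lost."""
    n = len(lam)
    tau = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sent = lam[i][j]
            if sent == STAR:
                tau[i][j] = (STAR, STAR)      # not neighbors
            elif i == l:
                tau[i][j] = (sent, EMPTY)     # message from x_l is omitted
            else:
                tau[i][j] = (sent, sent)      # delivered unchanged
    return tau

def received_by(tau, j):
    """Column j of the event: what x_j receives from every entity."""
    return [tau[i][j][1] for i in range(len(tau))]

def omissions(tau):
    """Number of nonempty messages that were sent but not delivered."""
    return sum(1 for row in tau for (a, b) in row
               if a not in (STAR, EMPTY) and b is EMPTY)

# Ring 0-1-2-3-0, so deg(G) = 2. A and B are 0-adjacent: by Property 7.8.1
# only the messages sent by x_0 can differ between the two message arrays.
lam_a = [[STAR, "a01", STAR, "a03"],
         ["m10", STAR, "m12", STAR],
         [STAR, "m21", STAR, "m23"],
         ["m30", STAR, "m32", STAR]]
lam_b = [[STAR, "b01", STAR, EMPTY]] + lam_a[1:]

tau_a, tau_b = psi(lam_a, 0), psi(lam_b, 0)
same_inputs = all(received_by(tau_a, j) == received_by(tau_b, j) for j in range(4))
print(same_inputs, omissions(tau_a), omissions(tau_b))
```

Since every entity other than x_0 starts in the same state in both configurations and receives identical messages, Property 7.8.2 gives identical next states for all of them, so the two resulting configurations can differ only in x_0.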
Summarizing,

Lemma 7.8.3 Omit is deg(G)-admissible, continuous, and adjacency preserving.

Then, by Theorem 7.8.2, it follows that

Theorem 7.8.3 No p-agreement protocol P is correct in spite of deg(G) omission faults in Omit for p > ⌈n/2⌉.

Addition and Corruption Faults

Using a similar approach, we can show that when the faults are additions and corruptions, no strong majority protocol is correct in spite of deg(G) communication faults.

Let AddCorr denote the set of all events containing at most deg(G) addition and corruption faults. Thus, by definition, AddCorr is deg(G)-admissible.

It is not difficult to verify that AddCorr is continuous (Exercise 7.10.40).

We can prove that AddCorr is adjacency preserving as follows. For any two h-adjacent configurations A and B, consider the events π_h and ρ_h for Λ(A) = {α_ij} and Λ(B) = {γ_ij}, respectively, where for all (x_i, x_j) ∈ E,

π_h[i, j] = (α_ij, γ_ij) if i = h and α_ij = Ω; (α_ij, α_ij) otherwise

and

ρ_h[i, j] = (γ_ij, α_ij) if i = h and α_ij ≠ Ω; (γ_ij, γ_ij) otherwise.

It is not difficult to verify that π_h, ρ_h ∈ AddCorr and that the configurations π_h{A} and ρ_h{B} are h-adjacent. Hence AddCorr is adjacency preserving.

Summarizing,

Lemma 7.8.4 AddCorr is deg(G)-admissible, continuous, and adjacency preserving.

Then, by Theorem 7.8.2, it follows that

Theorem 7.8.4 No p-agreement protocol P is correct in spite of deg(G) communication faults in AddCorr for p > ⌈n/2⌉.

Byzantine Faults

We now show that no strong majority protocol is correct in spite of ⌈deg(G)/2⌉ arbitrary communication faults.

Let Byz be the set of all events containing at most ⌈deg(G)/2⌉ communication faults, where the faults may be omissions, corruptions, and additions. By definition, Byz is ⌈deg(G)/2⌉-admissible. Actually (see Exercises 7.10.41 and 7.10.42),

Lemma 7.8.5 Byz is ⌈deg(G)/2⌉-admissible, continuous, and adjacency preserving.

Then, by Theorem 7.8.2, it follows that

Theorem 7.8.5 No p-agreement protocol P is correct in spite of ⌈deg(G)/2⌉ communication faults in Byz for p > ⌈n/2⌉.

7.8.3 Unanimity in Spite of Ubiquitous Faults

In this section we examine the possibility of achieving unanimity among the entities, that is, full agreement in spite of dynamic faults. We will examine the problem under the following restrictions:

Additional Assumptions (MA)
1. Connectivity, Bidirectional Links;
2. Synch;
3. all entities start simultaneously;
4. each entity has a map of the network.

Surprisingly, unanimity can be achieved in several cases; the exact conditions depend not only on the type and number of faults but also on the edge connectivity c_edge(G) of G.

In all cases, we will reach unanimity, in spite of F communication faults per clock cycle, by computing the OR of the input values and deciding on that value. This is achieved by first constructing (if not already available) a mechanism for correctly broadcasting the value of a bit within a fixed amount of time T in spite of F communication faults per clock cycle. This reliable broadcast, once constructed, is then used to correctly compute the logical OR of the input values: all entities with input value 1 will reliably broadcast their value; if at least one of the input values is 1 (thus, the result of OR is 1), then every entity will learn this fact within time T; on the contrary, if all input values are 0 (thus, the result of OR is 0), there will be no broadcasts and every entity will be aware of this fact within time T.

The variable T will be called timeout. The actual reliable broadcast mechanism will differ depending on the nature of the faults.

Single Type Faults: Omissions

Consider the case when the communication errors are just omissions. That is, in addition to MA we have the restriction Omission that the only faults are omissions.
First observe that, because of Lemma 7.1.1, broadcast is impossible if F ≥ c_edge(G). This means that we might be able to tolerate at most c_edge(G) − 1 omissions per time unit.

Let F ≤ c_edge(G) − 1. When broadcasting in this situation, it is rather easy to circumvent the loss of messages. In fact, it suffices for all entities involved, starting from the initiator of the broadcast, to send the same message to the same neighbors for several consecutive time steps. More precisely, consider the following algorithm:

Algorithm Bcast-Omit
1. To broadcast in G, node x sends its message at time 0 and continues transmitting it to all its neighbors until time T(G) − 1 (the actual value of the timeout T(G) will be determined later);
2. a node y receiving the message at time t < T(G) will transmit the message to all its other neighbors until time T(G) − 1.

Let us verify that if F < c_edge(G), there are values of the timeout T(G) for which the protocol performs the broadcast.

As G has edge connectivity c_edge(G), by Property 7.1.1, there are at least c_edge(G) edge-disjoint paths between x and y; furthermore, each of these paths has length at most n − 1. According to the protocol, x sends a message along all these c_edge(G) paths. At any time instant, there are F < c_edge(G) omissions; this means that at least one of these paths is free of faults. That is, at any time unit, the message from x will move one step further toward y along one of them. Since these paths have length at most n − 1, after at most c_edge(G)(n − 2) + 1 = c_edge(G) n − 2 c_edge(G) + 1 time units the message from x will reach y. This means that with

T(G) ≥ c_edge(G) n − 2 c_edge(G) + 1,

it is possible to broadcast in spite of F < c_edge(G) omissions per time unit. This value for the timeout is rather high and, depending on the graph G, can be substantially reduced. Let us denote by T*(G) the minimum timeout value that ensures that algorithm Bcast-Omit correctly performs the broadcast in G.
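The behavior of Bcast-Omit under mobile omissions can be explored with a small round-based simulation. The Python sketch below is an illustration under stated assumptions, not the algorithm's definitive implementation: the adversary `greedy_adversary` is one hypothetical strategy (it drops up to F transmissions that would inform a new node), informed nodes retransmit to all neighbors rather than only to their other neighbors (a simplification that does not change who gets informed), and the test graph is a 3-dimensional hypercube, whose edge connectivity is 3.

```python
def hypercube(d):
    """Adjacency lists of the d-dimensional hypercube on nodes 0 .. 2^d - 1."""
    return {u: [u ^ (1 << k) for k in range(d)] for u in range(2 ** d)}

def greedy_adversary(sends, informed, F):
    """Hypothetical adversary: omit up to F transmissions aimed at uninformed nodes."""
    useful = [(u, v) for (u, v) in sends if v not in informed]
    return useful[:F]

def bcast_omit(adj, source, F, timeout, adversary):
    """Simulate Bcast-Omit; return the time at which all nodes are informed,
    or None if the timeout expires first."""
    informed = {source}
    for t in range(timeout):
        # every informed node keeps transmitting the message to its neighbors
        sends = [(u, v) for u in sorted(informed) for v in adj[u]]
        dropped = set(adversary(sends, informed, F))
        assert len(dropped) <= F          # at most F omissions per time unit
        informed |= {v for (u, v) in sends if (u, v) not in dropped}
        if len(informed) == len(adj):
            return t + 1                  # messages sent at time t arrive at t + 1
    return None

adj = hypercube(3)
n, c_edge = len(adj), 3
bound = c_edge * (n - 2) + 1              # timeout value derived in the text
print(bcast_omit(adj, 0, c_edge - 1, bound, greedy_adversary))
print(bcast_omit(adj, 0, c_edge, bound, greedy_adversary))
```

With F = c_edge(G) − 1 the broadcast completes within the timeout derived above, while with F = c_edge(G) this adversary can cut the source off in every time unit, so the second call returns `None`. Other adversary strategies can be plugged in through the `adversary` parameter.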
Using algorithm Bcast-Omit to compute the OR we have the following:

Theorem 7.8.6 Unanimity can be reached in spite of F = c_edge(G) − 1 faults per clock cycle in time T*(G), transmitting at most 2 m(G) T*(G) bits.

What is the actual value of T*(G) for a given G? We have just seen that

T*(G) ≤ c_edge(G) n − 2 c_edge(G) + 1.    (7.24)

A different available bound (Problem 7.10.1) is

T*(G) = O(diam(G)^(c_edge(G))).    (7.25)

They are both estimates of how much time it takes for the broadcast to complete. Which estimate is better (i.e., smaller) depends on the graph G. For example, in a hypercube H, c_edge(H) = diam(H) = log n; hence, if we use Equation 7.24 we have O(n log n), while with Equation 7.25 we would have a time O(n^(log log n)), since (log n)^(log n) = n^(log log n).

Actually, in a hypercube, both estimates are far from accurate. It is easy to verify (Exercise 7.10.43) that T*(H) ≤ log² n. It is not so simple (Exercise 7.10.44) to show that the timeout is actually

T*(H) ≤ log n + 2.    (7.26)

In other words, with only two time units more than in the fault-free case, broadcast can tolerate up to log n − 1 message losses per time unit.
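To get a feel for how far apart the hypercube estimates are, the following Python sketch evaluates them for a few dimensions d = log n. It is only a numerical illustration: the function name `hypercube_bounds` is hypothetical, and because Equation 7.25 is an asymptotic bound, its hidden constant is simply taken to be 1 here (an assumption), so that entry shows the growth rate rather than an actual timeout.

```python
def hypercube_bounds(d):
    """Timeout estimates for the d-dimensional hypercube (n = 2^d nodes)."""
    n = 2 ** d
    c_edge = diam = d                              # c_edge(H) = diam(H) = log n
    return {
        "eq_7_24": c_edge * n - 2 * c_edge + 1,    # c_edge(G) n - 2 c_edge(G) + 1
        "eq_7_25_shape": diam ** c_edge,           # diam(G)^(c_edge(G)), constant taken as 1
        "log_squared": d * d,                      # log^2 n
        "eq_7_26": d + 2,                          # log n + 2
    }

for d in (3, 4, 6, 10):
    print(d, hypercube_bounds(d))
```

Already for small dimensions the value log n + 2 is far below the other three estimates, which matches the remark that the general bounds are far from accurate on the hypercube.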