The Solution:
Host i's state is Normal and Host i spontaneously decides to conduct an election Ei = 0 Ci = i Bi = Ni Sends Elect(i) to all neighbors in Bi Host i enters state conducting Host i receives an Elect(j) message from k Ei = 0 Ci = j -- host j called the election Ri = k -- host i heard about it from k Bi = (Ni - k) -- the set of outgoing branches if (Bi) is the null set, host i is a leaf Host i enters state Result and Sends Vote(i) to Ri else, host i is not a leaf Host i enters state Subtree and Sends Elect(j) to all neighbors in Bi Host i receives a Vote() message Unexpected! Host i receives a Consensus() message Unexpected! Host i's state is Conducting and Host i receives an Elect(j) message from k Collision! Host i receives a Vote(j) message from k Ei = max( Ei, j ) Bi = Bi - k if Bi is the null set, all votes are in send Consensus(Ei) to all Ni and enter state Normal Host i receives a Consensus(j) message from k Unexpected! Host i's state is Subtree and Host i receives an Elect(j) message from k Collision! Host i receives a Vote(j) message from k Ei = max( Ei, j ) Bi = Bi - k if Bi is the null set, all votes are in send Vote(Ei) to Ri and enter state Result Host i receives a Consensus(j) message from k Unexpected! Host i's state is Result and Host i receives an Elect(j) message from k Collision! Host i receives a Vote(j) message from k Unexpected! Host i receives a Consensus(j) message from k Ei = j Send Consensus(j) to all Ni - k and enter state NormalThe Problem, part B: In what states does the receipt of what messages imply that two elections have collided? These are noted above. Modify the protocol so it guarantees that only one of the two colliding elections will be completed.
A Solution (outline):
Basically, whenever a collision is detected between two elections, compare the identity of the conductors (Ci for host i's own ongoing electon, and the field j from the received Elect(j) message). If every host uses the same comparison rule to determine which election should proceed and which should stop, then the rule is simple.
When a host receives an Elect(j) message and declares itself the winner of the collision, it simply ignores the message. When a host receives such a message and declares itself the loser, it ignores the election it was participating in and acts as if its prior state was normal. All messages must be modified to include the identity of the conductor, and all incoming messages that correspond to the wrong conductor can be safely ignored.
A Solution (outline):
None given here
The only failures that concern us on this system are failures of a process. That is, we assume that memory is fully reliable. Assignment of a pointer or integer can be done as an atomic operation, at the machine level, but assignment of floating point numbers is not atomic -- it requires multiple load or store operations.
Problem: Write code to atomically increment a shared floating point variable.
One solution: Use three floating point variables, F, G and H, and assume that no failure will corrupt more than one of the three. If the first two are equal, all is well (even if the third is corrupt); if they differ and the second two are equal, the failure must have involved an update to the first. If all three differ, the failure must have involved a partial update of the middle one, so the first can be relied on to be good.
In any case, if a process fails in its critical section, it will fail while holding the semaphore, so no error checking is necessary for normal entry.
if P(mutex, generous_delay) then -- normal case temp = F else -- there was some error if F = G then temp = F -- H may be corrupt else if G=H temp = G -- F was corrupt else temp = F -- G was corrupt endif endif temp = temp + 1 F = temp G = temp H = temp V(mutex)
Another solution: Use Ptr, a pointer to a floating point variable and two variables, F and G.
P(mutex, generous_delay) if Ptr = address_of(F) then localPtr = address_of(G) else localPtr = address_of(F) -- now Ptr and localPtr point to different variables localPtr^ = Ptr^ + 1 Ptr = localPtr -- the actual update visible to the world V(mutex)