|Full text||Click to download.|
|Citation||Ph.D. Dissertation, Yale University, 2007.
Based on earlier research on population protocols, we establish a theoretical
model to study distributed systems of simple anonymous interacting agents. We
define global fairness to model nondeterministic scheduling in loosely-coupled
distributed systems and show that global fairness is a realistic assumption
and useful abstraction in many circumstances.
Fault-tolerance is a critical requirement for most tasks in the systems we consider. We study fault-tolerance in this model from two aspects: self-stabilization and distributed consensus.
Self-stabilization is a way to design algorithms that tolerate arbitrary transient faults. In the type of systems we study, most of the known algorithms and even problem definitions are no longer applicable. We give the appropriate definitions and present self-stabilizing algorithms for a number of important problems including token circulation, distance-$2$ coloring, ring orientation, spanning-tree construction, and leader election, with some constraints on the underlying interaction graph or external inputs.
To achieve constant-space uniform self-stabilizing leader election in rings under the assumption of global fairness, we introduce the failure detector $\Omega?$ to provide the agents with limited eventually-correct global information concerning the presence or absence of leader(s). $\Omega?$ can be implemented in practice with simple mechanisms such as timeouts. On the other hand, uniform self-stabilizing leader election is impossible, under local fairness, even with the help of $\Omega?$.
It is known that consensus is impossible in the presence of even one crash failure in an asynchronous system. We define a nonterminating version of the consensus problem appropriate to our model of computation called stabilizing consensus which allows the inputs and outputs to change over time and requires the outputs of nonfaulty agents to eventually reach agreement if the inputs of nonfaulty agents stabilize. We present stabilizing-consensus algorithms in the presence of both crash failures and Byzantine failures. To tolerate Byzantine failures, it is necessary to augment agents with partial identifiers, and the number of Byzantine faults must be less than one-third of the total number of distinct identifiers.
Back to publications
Back to previous page