Full text | Click to download. |
Citation | Ph.D. Dissertation, Yale University, 2007.
|
Author | Hong Jiang
|
Based on earlier research on population protocols, we establish a theoretical
model to study distributed systems of simple anonymous interacting agents. We
define global fairness to model nondeterministic scheduling in loosely-coupled
distributed systems and show that global fairness is a realistic assumption
and useful abstraction in many circumstances.
Fault-tolerance is a critical requirement for most tasks in the systems we
consider. We study fault-tolerance in this model from two aspects:
self-stabilization and distributed consensus.
Self-stabilization is a way to design algorithms that tolerate arbitrary
transient faults. In the type of systems we study, most of the known
algorithms and even problem definitions are no longer applicable. We give the
appropriate definitions and present self-stabilizing algorithms for a number
of important problems including token circulation, distance-$2$ coloring, ring
orientation, spanning-tree construction, and leader election, with some
constraints on the underlying interaction graph or external inputs.
To achieve constant-space uniform self-stabilizing leader election in rings
under the assumption of global fairness, we introduce the failure detector
$\Omega?$ to provide the agents with limited eventually-correct global
information concerning the presence or absence of leader(s). $\Omega?$ can be
implemented in practice with simple mechanisms such as timeouts. On the other
hand, uniform self-stabilizing leader election is impossible, under local
fairness, even with the help of $\Omega?$.
It is known that consensus is impossible in the presence of even one crash
failure in an asynchronous system. We define a nonterminating version of the consensus problem
appropriate to our model of computation called stabilizing consensus which allows the inputs and outputs to change over time and
requires the outputs of nonfaulty agents to eventually reach agreement if the inputs of nonfaulty agents stabilize. We present
stabilizing-consensus algorithms in the presence of both crash failures and
Byzantine failures.
To tolerate Byzantine failures, it is necessary to augment agents with partial identifiers, and the number of Byzantine faults must be less than
one-third of the total number of distinct identifiers.