Full text | Click to download. |
Citation | PhD Thesis, Stanford University, 2004
|
Author | G. S. Manku
|
A Distributed Hash Table (DHT)
is a giant hash table that is cooperatively maintained by a large
number of machines worldwide. The machines join and leave the
system autonomously. The unprecedented scale and dynamism of the
system calls for novel design techniques which emphasize
decentralization and automatic re-configuration. Briefly, each
machine in a DHT is assigned an ID in I = [0; 1). The set of IDs
divides I into disjoint partitions, managed by one machine
each. As a function of their IDs, the machines set up connections
among themselves. These connections are used for routing messages
between the machines. The challenge lies in devising efficient
decentralized algorithms for ID management and connection
maintenance.
We propose Dipsea, a modular architecture for building DHTs,
consisting of three layers: ID Management, Overlay Routing and
Data Management. This thesis focuses on the first two layers. The
modularity of Dipsea imbues the overall system with several good
properties. A large complex problem is broken down into smaller
sub-problems, each of which can be attacked more or less
independently. This contributes to reduction in complexity: It is
possible to explore the design space of a sub-problem without
being encumbered by its interactions with other
sub-problems. Furthermore, the best solutions for individual
sub-problems can be identified and put together to arrive at an
overall design that is far more powerful than a design arrived at
by a holistic approach.
Dipsea places existing DHT designs and improvements suggested for
various DHTs into a common algorithmic framework. A significant
accomplishment is the identification of layers and modules which
are cleanly separated on the basis of functionality. Then for each
module, we devise and analyze efficient algorithms. Almost all of
the algorithms we propose are the currently best-known algorithms
for the corresponding modules.
Highlights of our contributions in ID Management and Overlay
Routing layers:
1. A simple, decentralized ID Management algorithm which is
independent of the Overlay Routing layer. The algorithm requires O(R +
log n) messages and only one reassignment of existing IDs in response
to arrival or departure of machines. R denotes the average number of
messages required by the Overlay Routing layer, and n denotes the
current number of machines in the system.
2. A generalization of the above scheme whose analysis requires the
solution to a novel Structured Coupon Collection Problem over cliques
with multiple-choices per trial.
3. The design of an Emulation Engine which can emulate arbitrary
families of deterministic and randomized routing networks. The
Emulation Engine makes the design of Dipsea a significant improvement
over existing DHT implementations, all of which are tied to specific
families of routing networks.
4. Characterization of shortest paths in Chord, a family of
deterministic routing networks that has been designed for DHTs.
5. The design of Papillon, a butterfly-based family of graphs defined
over nodes placed in a circle. Papillon supports efficient greedy
routing, in which each node forwards a message along that out-going
edge which reduces the clockwise distance to the destination by the
largest amount. In an n-node graph, Papillon routes in O(log n / log
d) hops with d links per node, which is asymptotically optimal.
6. The design of Symphony, one of the first randomized routing networks
proposed for DHT routing. With k links per node, clockwise greedy
routing takes O( 1/k log^2(n)) hops on average.
7. Tight analysis of clockwise greedy routing with/without lookahead
in several randomized routing networks including Symphony, randomized
Chord, and randomized hypercube. The idea underlying ``greedy with
lookahead'' is to allow a node to use knowledge of its neighbor's
neighbors for better routing decisions. We show that greedy routing
without lookahead requires Theta(log n) hops on average whereas greedy
with lookahead entails only Theta(log n / log log n) hops on average.
8. The design of Mariposa, an interesting combination of butter y
networks and Kleinberg's small-world construction. Mariposa also
offers routes of O(log n / log d) hops in the worst case, with d
out-going links per node.