Full text | Click to download. |
Citation | IEEE Symposium on the Foundations of Computer Science (FOCS 04),
October 2004
|
Authors | Gagan Aggarwal
Mayur Datar Sridhar Rajagopalan Matthias Ruhl |
The need to deal with massive data sets in many practical applications
has led to a growing interest in computational models appropriate for
large inputs. The most important quality of a realistic model is that
it can be efficiently implemented across a wide range of platforms and
operating systems.
In this paper, we study the computational model that results if the
streaming model is augmented with a sorting primitive. We argue that
this model is highly practical, and that a wide range of important
problems can be efficiently solved in this (relatively weak) model.
Examples are undirected connectivity, minimum spanning trees, and
red-blue line segment intersection, among others. This suggests that
using more powerful, harder to implement models may not always be
justified.
Our main technical contribution is to show a hardness result for the
"streaming and sorting" model, which demonstrates that the main
limitation of this model is that it can only access one data stream at
a time. Since our model is strong enough to solve "pointer chasing"
problems, the communication complexity based techniques commonly used
in showing lower bounds for the streaming model cannot be adapted to
our model. We therefore have to employ new techniques to obtain these
results.
Finally, we compare our model to a popular restriction of external
memory algorithms that access their data mostly sequentially.