Graph Distances in the Streaming Model: The Value of Space

Full textClick to download.
CitationIn Proc. of the 16th Symposium on Discrete Algorithms, ACM/SIAM, pp. 745-754, 2005
AuthorsJoan Feigenbaum
Sampath Kannan
Andrew McGregor
Siddharth Suri
Jian Zhang.


We investigate the importance of space when solving problems based on graph distance in the streaming model. In this model, the input graph is presented as a stream of edges in an arbitrary order. The main computational restriction of the model is that we have limited space and therefore cannot store all the streamed data; we are forced to make space-efficient summaries of the data as we go along. For a graph of n vertices and m edges, we show that testing many graph properties, including connectivity (ergo any reasonable decision problem about distances) and bipartiteness, requires \omega(n) bits of space. Given this, we then investigate how the power of the model increases as we relax our space restriction. Our main result is an efficient randomized algorithm that constructs a (2t + 1)- spanner in one pass. With high probability, it uses O(t n^{1+1/t}log^2(n)) bits of space and processes each edge in the stream in O(t^2 n^{1/t}log(n)) time. We find approximations to diameter and girth via the constructed spanner. For t = \omega(log(n)/log(log(n))) here), the space requirement of the algorithm is O(n polylog n), and the per- edge processing time is O(polylog n). We also show a corresponding lower bound of t for the approximation ratio achievable when the space restriction is O(t n^{1+1/t}log^2 n). We then consider the scenario in which we are allowed multiple passes over the input stream. Here, we investigate whether allowing these extra passes will compensate for a given space restriction. We show that finding vertices are distance d from a particular vertex will always take d passes, for all d \in {1,..., t/2}, when the space restriction is o(n^{1+1/t}). For girth, we show the existence of a direct trade-off between space and passes in the form of a lower bound on the product of the space requirement and number of passes. Finally, we conclude with two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by at most a log factor.

Back to publications
Back to previous page