Finding Strongly Connected Components and Topological Sort in Parallel using O(log² n) reachability queries

Finding Strongly Connected
Components and Topological
Sort in Parallel using O(log² n)
reachability queries
Warren Schudy
Brown University
Work done while interning at
Google Research Mountain View
A Scheduling Problem
• To-do List
– Topological sort (TS)
– Strongly connected
components (SCC)
– Reachability
•  SCC application in
scientific computing
requiring parallelism
[McLendon et al. 01]
Previous Results for TS, SCC and
Reachability
• Assume a sparse graph with n vertices, using 1 ≤ p ≤ n^(4/3) processors:

                               Runtime (ignore logs)   Problems
Coppersmith and Winograd ’87   n^2.38…/p               All
T. Spencer ’97                 n/p^(1/3)               All
Ullman & Yannakakis ’91        n/p^(1/2)               Reachability
(In Transactions of Information Processing Society of Japan ’99 ’04, Akio,
Masahiro and Ryozo claim runtime n/p for TS & SCC)
• Question: is reachability fundamentally easier to
parallelize than SCC and TS?
Answer: no (up to logs)
• Our main result: a reduction of SCC and TS
to O(log² n) reachability queries
                     Runtime     Problems
This work            n/p^(1/2)   TS and SCC
Coppersmith et al.   n^2.38/p    All
Spencer              n/p^(1/3)   All
Ullman et al.        n/p^(1/2)   Reachability
• Remainder of talk focuses on SCC problem
A simple SCC algorithm
• Choose a random vertex
s ∈ V
• Determine SCC(s) and
output it
• Determine the vertices
Desc(s) reachable from s
• Recurse (in parallel) on:
– Desc(s) \ SCC(s)
– V \ Desc(s)
[Figure: vertex s, its component SCC(s), its descendant set Desc(s), and the remainder V \ Desc(s)]
(Similar to [Coppersmith, Fleischer, Hendrickson and Pinar ’05])
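The steps on this slide can be sketched sequentially in Python. This is only an illustration under stated assumptions: the slide does not say how SCC(s) is determined, so the sketch assumes it is computed as Desc(s) intersected with the set of vertices that can reach s (one forward and one backward search). The names `reachable`, `reverse` and `simple_scc` are hypothetical, and an explicit work stack stands in for the parallel recursion.

```python
import random
from collections import deque


def reachable(adj, sources, allowed):
    """Vertices of `allowed` reachable from `sources` using only `allowed` vertices."""
    seen = {s for s in sources if s in allowed}
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in allowed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def reverse(adj):
    """Adjacency lists of the graph with every edge reversed."""
    radj = {}
    for u, vs in adj.items():
        for v in vs:
            radj.setdefault(v, []).append(u)
    return radj


def simple_scc(adj, vertices, seed=0):
    """Single-pivot SCC decomposition; returns a list of vertex sets."""
    rng = random.Random(seed)
    radj = reverse(adj)
    components = []
    work = [set(vertices)]            # subproblems still to process
    while work:
        sub = work.pop()
        if not sub:
            continue
        s = rng.choice(sorted(sub))   # random pivot vertex
        desc = reachable(adj, [s], sub)
        anc = reachable(radj, [s], sub)   # assumption: SCC(s) = Desc(s) ∩ Anc(s)
        scc = desc & anc
        components.append(scc)
        work.append(desc - scc)       # Desc(s) \ SCC(s)
        work.append(sub - desc)       # V \ Desc(s)
    return components
```

Each subproblem is searched only inside its own vertex set, which is safe because every strongly connected component lies entirely inside or entirely outside Desc(s).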
High-runtime instance
[Figure: pivot s and its descendant set Desc(s)]
Algorithmic Idea
• If this algorithm divided the graph roughly
in two, recursion depth would be log n
• So pick many source vertices instead of 1
(number chosen later)
[Figure: descendant sets Desc(·) of the chosen source vertices]
Outputting SCCs
• Make one pivot vertex s special, and
output its SCC
• Recurse on blue, green, yellow and unreached subgraphs
[Figure: pivot s with SCC(s) and Desc(s), and the descendant sets Desc(·) of the other sources]
Our Multipivot Algorithm
• Permute the vertices
randomly
• Determine the smallest s
s.t. {1,2,…s} together
reach at least half the
edges (binary search)
• Output SCC(s)
• Recurse on:
– V \ Desc(1…s)
– (Desc(1…s-1)Desc(s)) \
SCC(s)
– Desc(1…s-1) \ Desc(s)
– Desc(s) \ (Desc(1…s1)SCC(s))
[Figure: pivots 1, 2, 3 and s = 4, with regions SCC(s), Desc(1…s-1) and Desc(s)]
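A minimal sequential sketch of the multipivot steps listed on this slide, in Python. Several details are assumptions made for illustration and are not taken from the talk: "edges reached" is counted as the edges of the current subgraph whose tail lies in the descendant set, SCC(s) is computed with a backward search from s, each reachability query is a plain breadth-first search, and the names `reach` and `multipivot_scc` are hypothetical. A work list stands in for the parallel recursion.

```python
import random
from collections import deque


def reach(adj, sources, allowed):
    """One reachability query: vertices of `allowed` reachable from `sources`."""
    seen = {s for s in sources if s in allowed}
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in allowed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def multipivot_scc(edges, vertices, seed=0):
    """Multipivot SCC decomposition; returns a list of vertex sets."""
    rng = random.Random(seed)
    adj, radj = {}, {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        radj.setdefault(v, []).append(u)
    out = []
    work = [set(vertices)]
    while work:
        sub = work.pop()
        if not sub:
            continue
        order = sorted(sub)
        rng.shuffle(order)                      # random permutation of the vertices
        m = sum(1 for u in sub for v in adj.get(u, ()) if v in sub)

        def edges_reached(k):
            # Edges of the subgraph whose tail is reachable from the first k pivots.
            d = reach(adj, order[:k], sub)
            return sum(1 for u in d for v in adj.get(u, ()) if v in sub)

        lo, hi = 1, len(order)                  # binary search for the smallest s
        while lo < hi:
            mid = (lo + hi) // 2
            if 2 * edges_reached(mid) >= m:
                hi = mid
            else:
                lo = mid + 1
        s = order[lo - 1]
        before = reach(adj, order[:lo - 1], sub)   # Desc(1…s-1)
        desc_s = reach(adj, [s], sub)              # Desc(s)
        scc = desc_s & reach(radj, [s], sub)       # assumption: via backward search
        out.append(scc)
        work.append(sub - before - desc_s)         # V \ Desc(1…s)
        work.append((before & desc_s) - scc)       # (Desc(1…s-1) ∩ Desc(s)) \ SCC(s)
        work.append(before - desc_s)               # Desc(1…s-1) \ Desc(s)
        work.append(desc_s - before - scc)         # Desc(s) \ (Desc(1…s-1) ∪ SCC(s))
    return out
```

The binary search relies on the number of reached edges being non-decreasing in the number of pivots, so each level of recursion issues a logarithmic number of reachability queries in this sketch.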
Runtime Analysis
[Figure: pivots 1, 2, 3 and s = 4, with regions SCC(s), Desc(1…s-1) and Desc(s); labels k and k²]
Figure annotations:
• Each contains less than half the edges by definition of s
• May contain almost all the edges, but will contain only some of the transitive closure edges (due to random order)
Key Lemma 9
• Number of edges in the transitive closure of
Desc(s) \ (Desc(1…s-1) ∪ SCC(s)) is at
most 3/4 that of the parent subgraph V
• Correction for proof of Lemma 10:
"vertices strictly between g(v) and v"
other than v
after
Open question
• Are there better parallel algorithms for
reachability?
• E.g. can reachability on a 3-regular
digraph be computed in o(√n) time using n
processors? Ullman & Yannakakis takes
Õ(√n) time.
Acknowledgements & Questions
• D. Sivakumar
• Claire Mathieu
• Maurice Herlihy
• Glencora Borradaile
**Extra slides**
Combining Reachability Queries on
subgraphs
[Figure: Subgraph 1 and Subgraph 2, each with its sources, and a super-source]
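Only the figure labels of this slide survive, so the following Python sketch is one plausible reading rather than the talk's construction: queries on vertex-disjoint subgraphs are merged into a single graph, a super-source is given an edge to every source, and one search from the super-source answers all queries at once. The function name `combined_reachability`, the input layout, and the `SUPER` marker are all hypothetical.

```python
from collections import deque


def combined_reachability(subgraphs):
    """Answer several reachability queries with one search.

    `subgraphs` is a list of (adj, sources) pairs whose vertex sets are
    assumed to be pairwise disjoint. Returns one reached set per subgraph.
    """
    SUPER = ("super-source",)        # assumed not to collide with a real vertex
    merged = {SUPER: []}
    owner = {}                       # vertex -> index of its subgraph
    for i, (adj, sources) in enumerate(subgraphs):
        for u, vs in adj.items():
            merged[u] = list(vs)
            owner[u] = i
            for v in vs:
                owner[v] = i
        for s in sources:
            merged[SUPER].append(s)  # edge super-source -> source
            owner[s] = i

    # Single breadth-first search from the super-source.
    seen = {SUPER}
    queue = deque([SUPER])
    while queue:
        u = queue.popleft()
        for v in merged.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)

    # Split the answer back out by subgraph.
    result = [set() for _ in subgraphs]
    for v in seen:
        if v is not SUPER:
            result[owner[v]].add(v)
    return result
```

Because no edge crosses between subgraphs, the part of the combined answer that falls inside each subgraph equals the answer to that subgraph's own query.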