Transcript pptx
Graph Exploration
Connected component of a graph G = (V, E):
• Let ~ denote the connectedness relation on the graph G, i.e.,
u ~ v ≡ u is connected to v in G.
• ~ is an equivalence relation:
  • ~ is reflexive: ∀v ∈ V: v ~ v.
  • ~ is symmetric: ∀u, v ∈ V: u ~ v ⇒ v ~ u.
  • ~ is transitive: ∀u, v, w ∈ V: u ~ v ∧ v ~ w ⇒ u ~ w.
• The equivalence classes with respect to the equivalence
relation ~ are called the connected components of G.
• Alternatively: a connected component of a graph is a maximal
subset S ⊆ V s.t. all pairs x, y ∈ S are connected.
BFS(G, s)
  let Q be a queue;
  T ← ∅;
  Q.enqueue((s, s)); // add an artificial self-loop at s
  while Q ≠ ∅ do
    (x, y) ← Q.dequeue();
    if y is unvisited then
      visit y;
      T ← T ∪ {(x, y)};
      forall (y, z) ∈ E do
        if z is unvisited then
          Q.enqueue((y, z));
        fi
      od
    fi
  od
  remove (s, s) from T; // remove artificial self-loop from T
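The pseudocode above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `bfs_tree`, the dictionary-of-lists graph format `adj`, and the small example graph are made up for this sketch and do not come from the slides.

```python
from collections import deque

def bfs_tree(adj, s):
    """Return the BFS-tree edges of the component containing s.

    adj maps every node to a list of its neighbours (assumed format).
    """
    queue = deque([(s, s)])       # artificial self-loop at the start node
    visited = set()
    tree = []
    while queue:
        x, y = queue.popleft()    # FIFO: oldest edge first
        if y not in visited:
            visited.add(y)
            tree.append((x, y))
            for z in adj[y]:
                if z not in visited:
                    queue.append((y, z))
    tree.remove((s, s))           # remove the artificial self-loop again
    return tree

# Made-up example graph (undirected, stored in both directions).
example = {0: [7, 18], 7: [0, 22], 18: [0, 22], 22: [7, 18]}
print(bfs_tree(example, 0))       # [(0, 7), (0, 18), (7, 22)]
```

As in the pseudocode, the queue holds edges rather than nodes, so the edge (18, 22) is enqueued but discarded when it is dequeued, because 22 has already been visited via (7, 22).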
Breadth First Search
[Figure: step-by-step BFS run on an example graph with nodes 0 to 40. Each frame lists the edges currently in the queue, e.g. (0, 7) and (0, 18).]
BFS-tree:
Edges not contained in the BFS-tree can only connect successive
layers, or may connect nodes on the same layer.
The BFS-tree contains a shortest (s, v)-path for every node v
reachable from s.
Proof:
• We first show that the nodes are added to the BFS-tree T in increasing
order of distance to s.
• This means that after adding a node that obtains in T a distance of, say,
ℓ to the root s, the algorithm won't later add a node that obtains
distance ℓ′ < ℓ.
• To see this, assume for contradiction that this first happens for
nodes v and v′.
• Let u and u′ denote their predecessors in the tree. This means
u is on level ℓ − 1 and u′ is on level ℓ′ − 1. Further, v and v′ are
visited through edges (u, v) and (u′, v′), respectively.
• Node u′ is visited before u, because it lies on a lower level and v, v′
is the first pair that violates the order. Therefore, the edge (u′, v′) is
added to the queue before edge (u, v). But then the edge will also be
removed earlier, and hence v′ will be visited before v. A contradiction.
Proof:
• Assume that v is the node closest to s in G for which the shortest
(s, v)-path is not contained in T.
• Let u denote the predecessor of v in some shortest (s, v)-path, and
let x denote the predecessor of v in T.
• Then the shortest (s, u)-path will be contained in T. Further, the
distance of x to s in the tree will be larger than the distance of u to s,
as otherwise v would have the correct distance.
• The previous observation tells us that u is visited before x. But then
the edge (u, v) will be added to the queue before the edge (x, v).
Hence, v will not be visited via edge (x, v). A contradiction.
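To make the shortest-path property concrete, here is a small Python sketch that records each node's predecessor in the BFS-tree and reads a shortest path back from it. Note the assumptions: it uses the common node-queue variant of BFS (nodes are marked when they are enqueued) rather than the edge queue from the slides, and the names `bfs_parents`, `tree_path`, and the example graph are hypothetical.

```python
from collections import deque

def bfs_parents(adj, s):
    """Return (parent, dist): BFS-tree predecessor and tree depth of every
    node reachable from s. adj maps nodes to neighbour lists (assumed format)."""
    parent = {s: None}
    dist = {s: 0}
    queue = deque([s])
    while queue:
        y = queue.popleft()
        for z in adj[y]:
            if z not in parent:          # z is discovered for the first time
                parent[z] = y
                dist[z] = dist[y] + 1
                queue.append(z)
    return parent, dist

def tree_path(parent, v):
    """Follow predecessor pointers from v back to the root."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

# Made-up undirected example graph.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
parent, dist = bfs_parents(graph, 0)
print(dist[4], tree_path(parent, 4))     # 3 [0, 1, 3, 4]
```

On the same example one can also check the layer property: every edge of `graph` joins two nodes whose `dist` values differ by at most one.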
The only difference between DFS and BFS is that DFS uses a stack
instead of a queue:
DFS(G, s)
  let S be a stack;
  T ← ∅;
  S.push((s, s)); // add an artificial self-loop at s
  while S ≠ ∅ do
    (x, y) ← S.pop();
    if y is unvisited then
      visit y;
      T ← T ∪ {(x, y)};
      forall (y, z) ∈ E do
        if z is unvisited then
          S.push((y, z));
        fi
      od
    fi
  od
  remove (s, s) from T; // remove artificial self-loop from T
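A minimal Python sketch of the stack-based pseudocode, using a plain list as the stack. The function name `dfs_tree`, the `adj` dictionary format, and the example graph are assumptions made for illustration only.

```python
def dfs_tree(adj, s):
    """Return the DFS-tree edges of the component containing s.

    adj maps every node to a list of its neighbours (assumed format).
    """
    stack = [(s, s)]              # artificial self-loop at the start node
    visited = set()
    tree = []
    while stack:
        x, y = stack.pop()        # LIFO: most recently pushed edge first
        if y not in visited:
            visited.add(y)
            tree.append((x, y))
            for z in adj[y]:
                if z not in visited:
                    stack.append((y, z))
    tree.remove((s, s))           # remove the artificial self-loop again
    return tree

# Made-up example graph (undirected, stored in both directions).
example = {0: [7, 18], 7: [0, 22], 18: [0, 22], 22: [7, 18]}
print(dfs_tree(example, 0))       # [(0, 18), (18, 22), (22, 7)]
```

Compared with a queue-based search on the same graph, the only change is `pop()` from the end of the list, yet the resulting tree is a single path 0, 18, 22, 7 instead of two layers.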
Depth First Search
[Figure: DFS run on an example graph with nodes 0 to 40. The stack contents are shown as a list of edges, e.g. (0, 7) and (0, 18).]
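A recursive formulation of DFS:
DFS(G, x)
  visit x;
  forall (x, y) ∈ E do
    if y is unvisited then
      DFS(G, y);
      T ← T ∪ {(x, y)}; // T is global variable
    fi
  od
For the two versions to be equivalent, the forall-construct of the
recursive version needs to consider edges in the opposite order,
because the stack returns the most recently pushed edge first. (If the
stack version pushes edges in increasing order of their opposite
end-point, the recursive version first chooses the edge with the
highest opposite end-point.)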
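The remark about edge order can be checked on small inputs with the following sketch. Everything here is an assumption for illustration: the names `dfs_stack` and `dfs_recursive`, the choice to push edges in increasing order of end-point, and the example graph. Unlike the pseudocode, the recursive version records the tree edge before the recursive call, so that both lists come out in visiting order; the set of tree edges is unaffected.

```python
def dfs_stack(adj, s):
    """Stack-based DFS; edges are pushed in increasing order of end-point."""
    stack, visited, tree = [(s, s)], set(), []
    while stack:
        x, y = stack.pop()
        if y not in visited:
            visited.add(y)
            tree.append((x, y))
            for z in sorted(adj[y]):
                if z not in visited:
                    stack.append((y, z))
    tree.remove((s, s))
    return tree

def dfs_recursive(adj, s):
    """Recursive DFS; edges are tried in decreasing order of end-point."""
    visited, tree = set(), []

    def visit(x):
        visited.add(x)
        for y in sorted(adj[x], reverse=True):
            if y not in visited:
                tree.append((x, y))   # recorded before the call (visiting order)
                visit(y)

    visit(s)
    return tree

# Made-up undirected example graph.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(dfs_stack(graph, 0))            # [(0, 3), (3, 2), (2, 1)]
print(dfs_recursive(graph, 0))        # [(0, 3), (3, 2), (2, 1)]
```

For large graphs the recursive version may hit Python's recursion limit, which is one practical reason to prefer the explicit stack.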
DFS-tree:
In an undirected graph, edges not contained in the DFS-tree can only
connect ancestors and descendants.
BFS/DFS running time:
• If the graph is stored as an adjacency list, we perform work O(deg(v))
for every node v ∈ V, where deg(v) is the number of edges incident to v.
(Additionally we perform work O(n).)
• In total this gives a running time of O(n + m), where n = |V| and
m = |E|, which is optimal.
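The accounting behind the O(n + m) bound can be illustrated by counting how many adjacency-list entries a search actually scans. The sketch below is an assumption-laden illustration: it uses a node-queue BFS, nodes numbered 0 to n-1, and hypothetical helper names `build_adjacency` and `bfs_work`.

```python
from collections import deque

def build_adjacency(n, edges):
    """Adjacency lists for an undirected graph on nodes 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def bfs_work(adj, s):
    """Run BFS from s; return (nodes visited, adjacency entries scanned)."""
    visited = [False] * len(adj)      # O(n) initialisation
    visited[s] = True
    queue = deque([s])
    nodes_visited = 0
    entries_scanned = 0
    while queue:
        y = queue.popleft()
        nodes_visited += 1
        for z in adj[y]:              # deg(y) iterations
            entries_scanned += 1
            if not visited[z]:
                visited[z] = True
                queue.append(z)
    return nodes_visited, entries_scanned

# Made-up connected example with n = 4 nodes and m = 4 edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = build_adjacency(4, edges)
print(bfs_work(adj, 0))               # (4, 8)
```

Each undirected edge appears in two adjacency lists, so the scan count is the sum of all degrees, 2m = 8 here, on top of one unit of work per node.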
Connected Components:
• How do we find all connected components of a graph?
• Run BFS from a vertex s, and remove the connected
component.
• Repeat until all the vertices of the graph are exhausted.
• If a component C has n_C vertices and m_C edges, BFS or DFS
takes time O(n_C + m_C).
• Hence in total, we take O(n + m) time.
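The procedure above might look as follows in Python. This is a sketch, not the slides' implementation: instead of physically removing a component from the graph, it marks its nodes as seen, which has the same effect; the name `connected_components` and the `adj` format are assumptions.

```python
from collections import deque

def connected_components(adj):
    """Return the connected components of an undirected graph as node lists.

    adj maps every node to a list of its neighbours (assumed format).
    """
    seen = set()
    components = []
    for s in adj:
        if s in seen:
            continue                  # s belongs to an earlier component
        seen.add(s)
        component = [s]
        queue = deque([s])
        while queue:                  # BFS restricted to unseen nodes
            y = queue.popleft()
            for z in adj[y]:
                if z not in seen:
                    seen.add(z)
                    component.append(z)
                    queue.append(z)
        components.append(component)
    return components

# Made-up example with three components.
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
print(connected_components(graph))    # [[0, 1, 2], [3, 4], [5]]
```

Every node enters the queue once and every adjacency list is scanned once, which matches the O(n + m) total stated above.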
Directed Graphs:
• The algorithms for DFS and BFS are the same.
• The properties of the BFS-tree and DFS-tree are slightly different.
• BFS-tree: the BFS-tree contains a shortest directed path from s
to v, for every v ∈ V reachable from s.
• DFS-tree: more complicated...
DFS numbering:
Compute in which order the recursive calls initiated by the nodes
in the network finish.
z ← 0; // global variable
∀v ∈ V: f[v] ← 0; // initialize array
while ∃v ∈ V: f[v] = 0 do DFS(G, v);
DFS(G, x)
  visit x;
  forall (x, y) ∈ E do
    if y is unvisited then
      DFS(G, y);
      T ← T ∪ {(x, y)}; // T is global variable
    fi
  od
  f[x] ← ++z;
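A compact Python sketch of this numbering is given below. The names `dfs_numbering`, `f`, and `counter`, as well as the dictionary graph format and the example, are assumptions chosen for illustration; the recursion mirrors the pseudocode, including adding the tree edge after the recursive call returns.

```python
def dfs_numbering(adj):
    """Return (f, tree): finishing numbers 1..n and the DFS-forest edges.

    adj maps every node of a directed graph to its list of successors
    (assumed format; every node must appear as a key).
    """
    f = {v: 0 for v in adj}           # 0 means "no number assigned yet"
    visited = set()
    tree = []
    counter = 0

    def dfs(x):
        nonlocal counter
        visited.add(x)
        for y in adj[x]:
            if y not in visited:
                dfs(y)
                tree.append((x, y))
        counter += 1                  # the call for x finishes now
        f[x] = counter

    for v in adj:                     # restart while unnumbered nodes remain
        if f[v] == 0:
            dfs(v)
    return f, tree

# Made-up directed example: a cycle 1 -> 2 -> 3 -> 1 plus an edge 4 -> 3.
graph = {1: [2], 2: [3], 3: [1], 4: [3]}
f, tree = dfs_numbering(graph)
print(f)                              # {1: 3, 2: 2, 3: 1, 4: 4}
```

In the example, node 3 finishes first and the root 1 of the first tree finishes after all of its descendants, while node 4 starts a second tree and receives the largest number.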
Properties of DFS-numbering:
• All numbers f[v], v ∈ V are distinct.
• If (u, v) is a tree-edge with u being the parent of v, then
f[u] > f[v].
• The DFS-search from u calls DFS(v) when it examines edge
(u, v). This call must first finish before the DFS for u can finish.
Hence, u receives a larger number.
• Consider a non-tree edge (u, v). Then either f[u] > f[v] or u
and v have an ancestor/descendant relationship in the DFS-forest.
Proof:
• Assume for contradiction that there are two nodes u and v that
have no ancestor relationship, and that there exists an
edge (u, v) with f[u] < f[v].
• At any point in time during the DFS-search the active nodes
form a path in the final DFS-tree from the root to some node z
(i.e., all these nodes have an ancestor/descendant relation).
Proof:
• The only reason that the edge (u, v) is not part of the DFS-tree
is that v is already visited when the edge is examined.
• At this point in time u has not yet finished its DFS-call.
Consequently, v has not yet finished its DFS-call either, as
f[u] < f[v].
• But then v is active (it is visited but has not yet finished its DFS-call).
• Since u is active as well, this is a contradiction to the fact that u
and v do not have an ancestor/descendant relationship.
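The property just proved can be tested empirically with a short sketch. Assumptions: the helper names `dfs_times`, `related`, and `check_property` are hypothetical, and a single clock is used for both start and finish events, so the finishing values are ordered like the DFS numbers but are not consecutive. Ancestor/descendant pairs are recognised by nested start/finish intervals.

```python
def dfs_times(adj):
    """Return (start, finish) clock values of a DFS over all nodes."""
    start, finish = {}, {}
    clock = 0

    def dfs(x):
        nonlocal clock
        clock += 1
        start[x] = clock              # x becomes active
        for y in adj[x]:
            if y not in start:
                dfs(y)
        clock += 1
        finish[x] = clock             # x finishes

    for v in adj:
        if v not in start:
            dfs(v)
    return start, finish

def related(start, finish, u, v):
    """True if one of u, v is an ancestor of the other in the DFS-forest."""
    return ((start[u] <= start[v] and finish[v] <= finish[u]) or
            (start[v] <= start[u] and finish[u] <= finish[v]))

def check_property(adj):
    """Every edge (u, v) has finish[u] > finish[v] or joins related nodes."""
    start, finish = dfs_times(adj)
    return all(finish[u] > finish[v] or related(start, finish, u, v)
               for u in adj for v in adj[u])

# Made-up directed example with a back edge (3 -> 1) and a cross edge (4 -> 3).
print(check_property({1: [2], 2: [3], 3: [1], 4: [3]}))   # True
```

The back edge 3 -> 1 is the interesting case: the number of 3 is smaller than that of 1, but 1 is an ancestor of 3, so the second alternative of the property applies.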
Strongly Connected Components:
• Compute a DFS-numbering f for G.
• Compute a graph G′ obtained from G by reversing all edges.
• Start a DFS in G′ where the starting/restarting is always done
with the node v that has the maximum value of f[v] among the
yet unvisited nodes.
• The trees generated by this run are the strongly connected
components of G (and also of G′).
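The three steps above can be sketched in Python as follows. This is an illustrative sketch under assumptions: the function name `strongly_connected_components`, the dictionary graph format (every node must appear as a key), and the example graph are not from the slides. Instead of storing the numbers f[v] explicitly, the sketch keeps the nodes in finishing order, which carries the same information.

```python
def strongly_connected_components(adj):
    """Return the strongly connected components as sorted node lists."""
    visited = set()

    def dfs(graph, x, out):
        visited.add(x)
        for y in graph[x]:
            if y not in visited:
                dfs(graph, y, out)
        out.append(x)                 # x finishes: append in finishing order

    # Step 1: DFS-numbering on G (order[i] has DFS number i + 1).
    order = []
    for v in adj:
        if v not in visited:
            dfs(adj, v, order)

    # Step 2: build G' by reversing all edges.
    reversed_adj = {v: [] for v in adj}
    for u in adj:
        for v in adj[u]:
            reversed_adj[v].append(u)

    # Step 3: DFS on G', always restarting at the unvisited node
    # with the maximum DFS number.
    visited.clear()
    components = []
    for v in reversed(order):
        if v not in visited:
            tree_nodes = []
            dfs(reversed_adj, v, tree_nodes)
            components.append(sorted(tree_nodes))
    return components

# Made-up example: cycle 1 -> 2 -> 3 -> 1, edge 3 -> 4, cycle 4 -> 5 -> 4.
graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
print(strongly_connected_components(graph))   # [[1, 2, 3], [4, 5]]
```

In the example, the edge 3 -> 4 leads from the first component to the second but not back, so the two cycles end up in different trees of the search on the reversed graph.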
[Figure: an example graph G and the reversed graph G′, each drawn with nodes labelled 1 to 8.]
Proof:
• We first show that if u and v are strongly connected, then they
will end up in the same tree.
• Suppose u is the first node visited (among u and v), and it ends
up in some tree T_x with root x.
• T_x will contain all nodes reachable from x at the time the DFS on
x started (i.e., after removing nodes already contained in
previous trees).
• There exists a path from u to v in G′. For v not to be contained
in T_x there must exist nodes on this path that already belong to
other trees when DFS(x) starts.
• Let z denote the last node for which this holds, and let z′
denote its successor (why is z ≠ v?).
Proof:
• At the time that DFS(x) starts, the edge (z, z′) consists of a node
z that already finished its DFS-call (is contained in some tree),
and a node z′ that did not yet start. Hence, f′[z′] > f′[z] in
the end, where f′ is the DFS-numbering obtained for graph G′.
• This is a contradiction as z and z′ do not have an ancestor
relationship since they end up in different trees.
Claim (needed later):
More generally we can say that if there is a path from a to b and
the DFS first visits a, then a and b will be contained in the same
tree of the DFS.
Proof:
• Now we show that if u and v are two nodes in the same tree in
the DFS on G′ then they are in the same strongly connected component
in G.
• Assume that u and v are in tree T_x rooted at x.
• We show that both of them are strongly connected to x. From
this the result follows.
• We only show it for u as the proof for v is identical.
• We know that u is in the DFS-tree with root x generated on G′.
• This means there is a path from x to u in G′, which means that
there is a path from u to x in G.
• It remains to show that there is a path from x to u in G.
Proof:
• Why is u in the tree T_x? For this to happen the following must be
true:
  • There is no node w with a higher f[w] value than x that
  can reach either u or x in G′. Otherwise, at the time we start
  the call at x in the DFS on G′ there either exists an unvisited
  node with larger f[w] value
  OR
  from the existing trees there exists an outgoing edge to a yet
  unvisited vertex.
Both possibilities lead to contradictions.
Proof:
• Now, suppose that in the DFS on G we start the DFS-call on u
before we start the call on x.
• u and x must end up in the same tree since there is a path from
u to x (see previous claim).
• We know that the DFS-call for x finishes later as f[x] > f[u].
• Therefore we finish the call for u before we start the call on x, as
DFS-calls cannot "interleave" (u-start x-start u-end x-end cannot
happen).
• Therefore there exists a root r different from u and x in the DFS-tree
in G.
• This root will obtain a higher DFS-number than x, and can reach
both x and u. A contradiction.
Proof:
• This means that in the DFS on G we start the DFS-call on x before we
start the call on u.
• However, in this case the search from x must visit u, as otherwise
u could not obtain a lower DFS-number.
• Therefore, there must exist a path from x to u.