Transcript pptx

Graph Exploration
Connected component of a graph G = (V, E):
• Let ~ denote the connectedness relation on the graph G, i.e.,
u ~ v ≡ u is connected to v in G.
• ~ is an equivalence relation:
  • ~ is reflexive: ∀v ∈ V: v ~ v.
  • ~ is symmetric: ∀u, v ∈ V: u ~ v ⇒ v ~ u.
  • ~ is transitive: ∀u, v, w ∈ V: u ~ v ∧ v ~ w ⇒ u ~ w.
• The equivalence classes with respect to the equivalence
relation ~ are called the connected components of G.
• Alternatively: a connected component of a graph is a maximal
subset X ⊆ V s.t. all pairs x, y ∈ X are connected.
BFS(G, s)
  let R be a queue;
  T ≔ ∅;
  R.enqueue((s, s)); // add an artificial self-loop to s
  while R ≠ ∅ do
    (x, y) ≔ R.dequeue();
    if y is unvisited then
      visit y;
      T ≔ T ∪ {(x, y)};
      forall (y, z) ∈ E do
        if z is unvisited then
          R.enqueue((y, z));
      od
    fi
  od
  remove (s, s) from T; // remove artificial self-loop from T
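The BFS pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `bfs_tree`, the dictionary-of-lists graph representation `adj`, and the small example graph are assumptions made for this sketch.

```python
from collections import deque

def bfs_tree(adj, s):
    """Return (visit order, tree edges) of a BFS from s.

    adj is assumed to map every node to a list of its neighbours.
    """
    visited = set()
    order = []
    tree = []
    queue = deque([(s, s)])          # artificial self-loop on s
    while queue:
        x, y = queue.popleft()
        if y not in visited:
            visited.add(y)           # "visit y"
            order.append(y)
            tree.append((x, y))
            for z in adj[y]:
                if z not in visited:
                    queue.append((y, z))
    tree.remove((s, s))              # drop the artificial self-loop
    return order, tree

# Small illustrative undirected graph (an assumption for this sketch).
edges = [(0, 7), (0, 18), (7, 13), (7, 16), (7, 22), (18, 5), (18, 15), (18, 28)]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

order, tree = bfs_tree(adj, 0)
print(order)  # [0, 7, 18, 13, 16, 22, 5, 15, 28]
print(tree)   # the eight edges above, in the order they were used
```

As in the pseudocode, the queue stores edges rather than nodes, so a node may be enqueued several times and is only marked as visited when it is dequeued.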
Breadth First Search
[Figure: animated BFS run on an example graph with nodes 0 to 40, started at node 0. Each frame shows the current contents of the edge queue, which begins (0,7), (0,18), (7,13), (7,16), (7,22), (18,5), (18,15), (18,28), ...]
BFS-tree:
[Figure: BFS-tree rooted at s.]
Edges not contained in the BFS-tree connect either nodes on successive
layers or nodes on the same layer.
The BFS-tree contains a shortest (s, v)-path for every node v
reachable from s.
Proof:
• We first show that the nodes are added to the BFS-tree T in non-decreasing
order of their distance to s.
• This means that after adding a node that obtains in T a distance of, say,
ℓ to the root s, the algorithm won't later add a node that obtains
distance ℓ′ < ℓ.
• To see this, assume for contradiction that this first happens for
nodes v and v′.
• Let u and u′ denote their predecessors in the tree. This means
u is on level ℓ − 1 and u′ is on level ℓ′ − 1. Further, v and v′ are
visited through edges (u, v) and (u′, v′), respectively.
• Node u′ is visited before u, because ℓ′ − 1 < ℓ − 1 and the pair v, v′ is the
first violation of the order. Therefore, the edge (u′, v′) is added to
the queue before edge (u, v). But then this edge will also be
removed earlier, and hence v′ will be visited before v. A contradiction.
Proof:
• Assume that v is the node closest to s in G for which no shortest
(s, v)-path is contained in T.
• Let u denote the predecessor of v on some shortest (s, v)-path, and
let x denote the predecessor of v in T.
• Then a shortest (s, u)-path is contained in T. Further, the
distance of x to s in the tree is larger than the distance of u to s,
as otherwise v would have the correct distance.
• The previous observation tells us that u is visited before x. But then
the edge (u, v) is added to the queue before the edge (x, v).
Hence, v is not visited via the edge (x, v). A contradiction.
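The layer property of the BFS-tree can be checked on a small example. The sketch below is only an illustration: it uses the common node-queue variant of BFS (nodes are marked when enqueued) rather than the edge-queue formulation of the slides, and the name `bfs_levels` and the example graph are assumptions.

```python
from collections import deque

def bfs_levels(adj, s):
    """Return the BFS layer (depth in the BFS-tree) of every node reachable from s."""
    level = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in level:
                level[y] = level[x] + 1
                queue.append(y)
    return level

# Hypothetical undirected graph: the cycle 0-1-2-3-4-0 plus the chord 1-4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 4)]
adj = {v: [] for v in range(5)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

level = bfs_levels(adj, 0)
# Every edge joins nodes on the same layer or on successive layers.
layer_gaps = [abs(level[u] - level[v]) for u, v in edges]
print(level)       # {0: 0, 1: 1, 4: 1, 2: 2, 3: 2}
print(layer_gaps)  # [1, 1, 0, 1, 1, 0]
```

The two edges with gap 0 (2-3 and 1-4) join nodes on the same layer; no edge skips a layer.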
The only difference between DFS and BFS is that DFS uses a stack
instead of a queue:
DFS(G, s)
  let R be a stack;
  T ≔ ∅;
  R.push((s, s)); // add an artificial self-loop to s
  while R ≠ ∅ do
    (x, y) ≔ R.pop();
    if y is unvisited then
      visit y;
      T ≔ T ∪ {(x, y)};
      forall (y, z) ∈ E do
        if z is unvisited then
          R.push((y, z));
      od
    fi
  od
  remove (s, s) from T; // remove artificial self-loop from T
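A possible Python rendering of the stack-based DFS is sketched below. The function name `dfs_tree`, the adjacency-dictionary representation and the four-node example graph are assumptions chosen for illustration.

```python
def dfs_tree(adj, s):
    """Iterative DFS from s; returns (visit order, tree edges)."""
    visited = set()
    order = []
    tree = []
    stack = [(s, s)]                 # artificial self-loop on s
    while stack:
        x, y = stack.pop()
        if y not in visited:
            visited.add(y)           # "visit y"
            order.append(y)
            tree.append((x, y))
            for z in adj[y]:
                if z not in visited:
                    stack.append((y, z))
    tree.remove((s, s))              # drop the artificial self-loop
    return order, tree

# Hypothetical undirected graph: the 4-cycle 0-1-3-2-0.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
order, tree = dfs_tree(adj, 0)
print(order)  # [0, 2, 3, 1]
print(tree)   # [(0, 2), (2, 3), (3, 1)]
```

Because the last edge pushed is popped first, node 0 continues with neighbour 2 rather than 1, and the search goes deep (0, 2, 3, 1) before the leftover edge (0, 1) is popped and ignored.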
Depth First Search
[Figure: animated DFS run on the example graph with nodes 0 to 40, started at node 0, showing the contents of the edge stack.]
A recursive formulation of DFS:
DFS(G, x)
  visit x;
  forall (x, y) ∈ E do
    if y is unvisited then
      DFS(G, y);
      T ≔ T ∪ {(x, y)}; // T is a global variable
    fi
  od
For the two versions to be equivalent, the forall-construct needs to
consider edges in the opposite order (i.e., first choose the edge whose
opposite end-point is highest).
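The remark about the edge order can be illustrated with a small experiment. The sketch below is an assumption-laden illustration (the helper names `dfs_stack` and `dfs_recursive`, the `flip` parameter and the example graph are not from the slides): it compares the tree of the stack-based DFS with the tree of the recursive DFS, once with the neighbours scanned in the opposite order and once in the same order.

```python
def dfs_stack(adj, s):
    """Stack-based DFS; returns the set of tree edges."""
    visited, tree = set(), set()
    stack = [(s, s)]
    while stack:
        x, y = stack.pop()
        if y not in visited:
            visited.add(y)
            tree.add((x, y))
            for z in adj[y]:
                if z not in visited:
                    stack.append((y, z))
    tree.discard((s, s))             # drop the artificial self-loop
    return tree

def dfs_recursive(adj, x, visited, tree, flip):
    """Recursive DFS; flip=True scans the neighbours in the opposite order."""
    visited.add(x)
    for y in (reversed(adj[x]) if flip else adj[x]):
        if y not in visited:
            dfs_recursive(adj, y, visited, tree, flip)
            tree.add((x, y))
    return tree

# Hypothetical undirected graph with neighbours listed in increasing order.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
stack_tree = dfs_stack(adj, 0)
flipped_tree = dfs_recursive(adj, 0, set(), set(), flip=True)
same_order_tree = dfs_recursive(adj, 0, set(), set(), flip=False)
print(stack_tree == flipped_tree)     # True
print(stack_tree == same_order_tree)  # False
```

The trees are compared as sets because the recursive version records an edge only after the corresponding call has returned.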
DFS-tree:
[Figure: DFS-tree rooted at s.]
Edges not contained in the DFS-tree can only connect a node with one of
its ancestors or descendants.
BFS/DFS running time:
• If the graph is stored as an adjacency list, we perform work O(d(v))
for every node v ∈ V, where d(v) is the degree of v (additionally we
perform work O(n)).
• Since the degrees sum to 2m, this gives a total running time of
O(n + m), which is optimal.
Connected Components:
• How do we find all connected components of a graph?
• Run BFS from a vertex s, and remove the connected
component.
• Repeat until all the vertices of the graph are exhausted.
• If a component C has n_c vertices and m_c edges, BFS or DFS
takes time O(n_c + m_c).
• Hence in total, we take O(n + m) time.
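The procedure can be sketched as follows. This is a minimal illustration under assumptions: the function name `connected_components` and the example graph are hypothetical, and instead of physically removing a component the sketch keeps a shared `seen` set, which has the same effect on later searches.

```python
from collections import deque

def connected_components(adj):
    """Return the connected components of an undirected graph as lists of nodes."""
    seen = set()
    components = []
    for s in adj:                    # restart BFS from every node not yet seen
        if s in seen:
            continue
        seen.add(s)
        component = [s]
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    component.append(y)
                    queue.append(y)
        components.append(component)
    return components

# Hypothetical graph: a path 0-1-2, an edge 3-4 and an isolated node 5.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
print(connected_components(adj))  # [[0, 1, 2], [3, 4], [5]]
```

Each node and each adjacency entry is processed once, which matches the O(n + m) bound.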
Directed Graphs:
• The algorithms for DFS and BFS are the same.
• The properties of the BFS-tree and DFS-tree are slightly different.
  • BFS-tree: the BFS-tree contains a shortest directed path from s
  to v, for every v ∈ V reachable from s.
  • DFS-tree: more complicated…
DFS numbering:
Compute the order in which the recursive calls started at the nodes
of the graph finish.
z ≔ 0; // global variable
∀v ∈ V: N[v] ≔ 0; // initialize array
while ∃v ∈ V: N[v] = 0 do DFS(G, v);

DFS(G, x)
  visit x;
  forall (x, y) ∈ E do
    if y is unvisited then
      DFS(G, y);
      T ≔ T ∪ {(x, y)}; // T is a global variable
    fi
  od
  N[x] ≔ ++z;
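A Python sketch of the numbering might look as follows. The names `dfs_numbering`, `number` and `counter` and the example graph are assumptions; the sketch also uses a `visited` set for the restart test instead of checking for a zero entry in the array. For large graphs the recursion depth limit of Python would have to be taken into account.

```python
def dfs_numbering(adj):
    """Number the nodes of a directed graph in the order their DFS-calls finish."""
    number = {}          # finishing number of each node (the array of the pseudocode)
    visited = set()
    tree = []
    counter = 0          # the global counter z of the pseudocode

    def dfs(x):
        nonlocal counter
        visited.add(x)
        for y in adj[x]:
            if y not in visited:
                dfs(y)
                tree.append((x, y))
        counter += 1
        number[x] = counter

    for v in adj:        # restart until every node has a number
        if v not in visited:
            dfs(v)
    return number, tree

# Hypothetical directed graph; adj maps a node to its successors.
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
number, tree = dfs_numbering(adj)
print(number)  # {'d': 1, 'b': 2, 'c': 3, 'a': 4, 'e': 5}
print(tree)    # [('b', 'd'), ('a', 'b'), ('a', 'c')]
```

In this example every tree edge leads from a larger to a smaller number, and so do the two non-tree edges (c, d) and (e, a).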
Properties of DFS-numbering:
• All numbers N[v], v ∈ V are distinct.
• If (u, v) is a tree-edge with u being the parent of v, then
N[u] > N[v].
  • The DFS-search from u calls DFS(v) when it examines edge
  (u, v). This call must first finish before the DFS for u can finish.
  Hence, u receives a larger number.
• Consider a non-tree edge (u, v). Then either N[u] > N[v] or u
and v have an ancestor/descendant relationship in the DFS-forest.
Proof:
• Assume for contradiction that there are two nodes u and v that
have no ancestor relationship, and that there exists an
edge (u, v) with N[u] < N[v].
• For any point in time during the DFS-search the active nodes
form a path in the final DFS-tree from the root to some node z
(i.e., all these nodes have an ancestor/descendant relation).
Proof:
• The only reason that the edge (u, v) is not part of the DFS-tree
is that v is already visited when the edge is examined.
• At this point in time u has not yet finished its DFS-call.
Consequently, v has not yet finished its DFS-call either, as
N[u] < N[v].
• But then v is active (it is visited but has not yet finished its DFS-call).
• Since u is active as well, this is a contradiction to the fact that u
and v do not have an ancestor/descendant relationship.
Strongly Connected Components:
• Compute a DFS-numbering for G.
• Compute a graph G′ obtained from G by reversing all edges.
• Start a DFS in G′ where the starting/restarting is always done
with the node v that has the maximum value of N[v] among the
yet unvisited nodes.
• The trees generated by this run are the strongly connected
components of G (and also of G′).
[Figure: an example graph G and its reversed graph G′, each with nodes labeled 1 to 8.]
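The steps of the strongly-connected-components procedure can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: the function name, the adjacency-dictionary representation (every node must appear as a key) and the six-node example graph are hypothetical, and the example is not the graph from the figure.

```python
def strongly_connected_components(adj):
    """Two-pass DFS sketch; adj maps every node to a list of its successors."""
    visited = set()

    def dfs(graph, x, out):
        visited.add(x)
        for y in graph[x]:
            if y not in visited:
                dfs(graph, y, out)
        out.append(x)                # x finishes: append in order of DFS number

    # Pass 1: DFS-numbering of G (finish_order lists nodes by increasing number).
    finish_order = []
    for v in adj:
        if v not in visited:
            dfs(adj, v, finish_order)

    # Build G' by reversing all edges.
    reversed_adj = {v: [] for v in adj}
    for u in adj:
        for v in adj[u]:
            reversed_adj[v].append(u)

    # Pass 2: DFS on G', always (re)starting at the unvisited node
    # with the largest DFS number from pass 1.
    visited.clear()
    components = []
    for v in reversed(finish_order):
        if v not in visited:
            tree_nodes = []
            dfs(reversed_adj, v, tree_nodes)
            components.append(sorted(tree_nodes))
    return components

# Hypothetical directed graph: cycle 1->2->3->1, edge 3->4, cycle 4->5->4, isolated 6.
adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
print(strongly_connected_components(adj))  # [[6], [1, 2, 3], [4, 5]]
```

Each tree of the second pass is reported as one component; the order of the components reflects the order in which the second DFS restarts.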
Proof:
• We first show that if u and v are strongly connected, then they
will end up in the same tree.
• Suppose u is the first node visited (among u and v), and it ends
up in some tree T_x with root x.
• T_x will contain all nodes reachable from x at the time the DFS on
x started (i.e., after removing nodes already contained in
previous trees).
• There exists a path from u to v in G′. For v not to be contained
in T_x there must exist nodes on this path that already belong to
other trees when DFS(x) starts.
• Let z denote the last node for which this holds, and let z′
denote its successor (why is z ≠ v?).
Proof:
• The edge (z, z′) at the time that DFS(x) starts consists of a node
z that already finished its DFS-call (is contained in some tree),
and a node z′ that did not yet start. Hence, N′[z′] > N′[z] in
the end, where N′ is the DFS-numbering obtained for graph G′.
• This is a contradiction as z and z′ do not have an ancestor
relationship since they end up in different trees.
Claim (needed later):
More generally, we can say that if there is a path from a to b and
the DFS first visits a, then a and b will be contained in the same
tree of the DFS.
Proof:
• Now we show that if u and v are two nodes in the same tree in
the DFS on G′, then they are in the same strongly connected component
in G.
• Assume that u and v are in tree T_x rooted at x.
• We show that both of them are strongly connected to x. From
this the result follows.
• We only show it for u as the proof for v is identical.
• We know that u is in the DFS-tree with root x generated on G′.
• This means there is a path from x to u in G′, which means that
there is a path from u to x in G.
• It remains to show that there is a path from x to u in G.
Proof:
• Why is u in the tree T_x? For this to happen the following must be
true:
• There is no node w with a higher N[w] value than x that
can reach either u or x in G′. Otherwise, at the time we start
the call at x in the DFS on G′, either there exists an unvisited
node with a larger N[w] value
OR
from the existing trees there exists an outgoing edge to a yet
unvisited vertex.
Both possibilities lead to contradictions.
Proof:
• Now, suppose that in the DFS on G we start the DFS-call on u
before we start the call on x.
• u and x must end up in the same tree since there is a path from
u to x (see previous claim).
• We know that the DFS-call for x finishes later as N[x] > N[u].
• Therefore we finish the call for u before we start the call on x, as
DFS-calls cannot "interleave" (u-start x-start u-end x-end cannot
happen).
• Therefore there exists a root r different from u and x in the DFS
tree in G.
• This root will obtain a higher DFS-number than x, and can reach
both x and u. A contradiction.
Proof:
• This means in the DFS on G we start the DFS-call on x before we
start the call on u.
• However, in this case the search from x must visit u as otherwise
u could not obtain a lower DFS-number.
• Therefore, there must exist a path from x to u.