Transcript Document

Distributed Routing Algorithms



In a message passing distributed system,
message passing is the only means of
interprocessor communication.
Unicast, Multicast, Broadcast
Communication latency in a distributed
system depends on the following factors:




Topology
Routing
Flow control
Switching
Topology


Network topology can be classified as
general purpose and special purpose.
A general purpose network does not have a
uniform and structured formation while a
special purpose network follows a
predefined structure.
Switching





Switching techniques fall into two classes:
store-and-forward, which includes packet switching;
cut-through, which includes circuit switching, virtual cut-through, and
wormhole routing.
Store-and-forward switching: a message is divided into packets that
can be sent to a destination via different paths. When a packet reaches
an intermediate node, the entire packet is stored and then forwarded to
the next node.
Circuit switching: a physical circuit is constructed before the
transmission. After the packet reaches the destination, the path is
destroyed.
Virtual cut-through switching: the packet is stored at the intermediate
node only if the required channel is busy; otherwise, it is forwarded
immediately without buffering.



Wormhole differs from virtual cut-through in two
aspects:
(1) Each packet is further divided into a number
of flits.
(2) When the required channel is busy, instead
of buffering the remaining flits by removing them
from the network channels, the flow control
blocks the trailing flits and they stay in flit buffers
along the established route.



At the system level, the main difference between
store-and-forward and cut-through is that the
former is sensitive to the length of the selected
path while the latter, especially in wormhole
routing with pipelined flits, is almost insensitive
to path length in the absence of network
congestion. That is, one unicasting to any
destination is considered one step.
The objective of using the store-and-forward
model is to minimize the path length.
The objective of using the cut-through model is to
Type of communication


Unicast, Multicast, Broadcast.
Personalized: a source sends different
messages to different destinations.
Routing








Routing algorithms can be classified as:

Special purpose vs. general purpose

Minimal vs. nonminimal

Deterministic vs. adaptive

Source routing vs. destination routing

Fault-tolerant vs. non fault-tolerant

Redundant vs. non redundant

Deadlock-free vs. non deadlock-free
General vs. Special Purpose

General-purpose algorithms are suitable for
all types of networks but may not be
efficient for a particular network.
Special-purpose algorithms are usually efficient by
taking advantage of the topological
properties of specific networks.
Minimal vs. Nonminimal

Minimal-path algorithms provide a least
cost path between source and destination.
This scheme can lead to congestion in parts
of a network. A nonminimal routing
scheme may route the message along a
longer path to avoid network congestion.
Deterministic vs. Adaptive

In a deterministic algorithm the routing
path changes only in response to
topological changes in the underlying
network and does not use any information
regarding the state of the network. In an
adaptive (dynamic) algorithm the routing path
changes based on the traffic in the network.
Fault-tolerant vs. Non-fault-tolerant

In fault-tolerant routing a routing
message is guaranteed to be delivered in
the presence of faults. In non-fault-tolerant
routing it is assumed that no fault
may occur, and hence, there is no need for
the routing algorithm to dynamically adjust
its activities.
Redundant vs. non Redundant

A typical routing algorithm is nonredundant, i.e.,
for each destination one copy of the message is
forwarded. In certain cases a shared path is used
to forward the routing message to several
destinations. For the purpose of fault tolerance,
multiple copies are sent to a destination via
multiple edge-disjoint paths. As long as one of
these paths remains healthy, at least one copy will
successfully reach its destination. Each
destination should make sure only one copy is
accepted.
Deadlock-free vs. Non-deadlock-free

A deadlock-free routing ensures freedom
from deadlock through carefully designed
routing algorithms. In a non deadlock-free
routing no special provision is given to
prevent or avoid the occurrence of a
deadlock.
Routing functions




The routing function defines how a message is routed from the source
node to the destination node.
Destination-dependent: This routing function depends on the current
and destination nodes only.
Input-dependent: This routing function depends on the current and
destination nodes and the adjacent link (or node) from which a
message is received.
Source-dependent: This routing function depends on the source,
current, and destination nodes.
Path-dependent: This routing function depends on the destination
node and the routing path from the source node to the current node.
Dijkstra’s centralized algorithm


Let D(v) be the distance (sum of link
weights along a given path) from source s
to node v. Let l(v,w) be the given cost
between nodes v and w.
There are two parts to the algorithm: An
initialization step and a step to be repeated
until the algorithm terminates.



1 Initialization. Set N={s}. For each node v
not in N, set D(v)=l(s,v). We use ∞ for nodes not
connected to s. Any number larger than the
maximum cost or distance in the network will
suffice.
2 At each subsequent step. Find a node w not
in N for which D(w) is a minimum and add w to
N. Then update D(v) for all nodes remaining that
are not in N by computing:
D(v)= min[D(v), D(w)+l(w,v)]
Step 2 is repeated until all nodes are in N.
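The two steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the dictionary `cost` plays the role of l(v,w), and the five-node network with its link costs is assumed here purely for the sake of the example.

```python
import math

def dijkstra(cost, s):
    """Return D(v), the least path cost from s to every node v.

    cost[v][w] holds l(v,w) for directly connected nodes only.
    """
    nodes = set(cost)
    N = {s}
    # Step 1: D(v) = l(s,v), with infinity for nodes not connected to s.
    D = {v: cost[s].get(v, math.inf) for v in nodes - N}
    D[s] = 0
    # Step 2: repeat until all nodes are in N.
    while N != nodes:
        # Find the node w outside N with minimum D(w) and add it to N.
        w = min(nodes - N, key=lambda v: D[v])
        N.add(w)
        # Update D(v) for every node still outside N.
        for v in nodes - N:
            D[v] = min(D[v], D[w] + cost[w].get(v, math.inf))
    return D

# Hypothetical network: symmetric link costs between nodes P1..P5.
cost = {
    "P1": {"P2": 4, "P3": 5},
    "P2": {"P1": 4, "P4": 1},
    "P3": {"P1": 5, "P4": 2, "P5": 20},
    "P4": {"P2": 1, "P3": 2, "P5": 2},
    "P5": {"P3": 20, "P4": 2},
}
D = dijkstra(cost, "P5")
```

With these assumed costs, the distances from P5 come out as 7, 3, 4, and 2 for P1 through P4. A priority queue would speed up the minimum search, but the plain scan keeps the sketch close to the two steps as stated.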
Ford’s distributed algorithm

Each node v has the label (n,D(v)) where D(v) represents the current
value of the shortest distance from the node to the destination and n is
the next node along the currently computed shortest path.

1 Initialization. With node d being the destination node, set
D(d)=0 and label all other nodes (., ∞).

2 Shortest-distance labeling of all nodes. For each node v ≠ d do
the following: Update D(v) using the current value D(w) for each
neighboring node w to calculate D(w)+l(w,v) and perform the
following update:
D(v)=min{D(v), D(w)+l(w,v)}
Step 2 is repeated until no label changes.
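A minimal sketch of the labeling process, assuming all nodes update in synchronous rounds and that each round uses the labels from the previous round. The function name `ford_rounds`, the stopping rule (stop when a round changes nothing), and the five-node network are assumptions made for illustration.

```python
import math

def ford_rounds(cost, d):
    """Return the list of label tables, one per round, for destination d.

    Each label is (next_node, distance); cost[v][w] holds l(v,w).
    """
    labels = {v: (None, math.inf) for v in cost}   # (., infinity)
    labels[d] = (None, 0)                          # D(d) = 0
    history = [dict(labels)]
    while True:
        new = dict(labels)
        for v in cost:
            if v == d:
                continue
            for w, c in cost[v].items():
                # D(w) + l(w,v), using w's label from the previous round.
                candidate = labels[w][1] + c
                if candidate < new[v][1]:
                    new[v] = (w, candidate)
        if new == labels:          # no label changed: the labels are final
            return history
        labels = new
        history.append(dict(labels))

# Hypothetical network: symmetric link costs between nodes P1..P5.
cost = {
    "P1": {"P2": 4, "P3": 5},
    "P2": {"P1": 4, "P4": 1},
    "P3": {"P1": 5, "P4": 2, "P5": 20},
    "P4": {"P2": 1, "P3": 2, "P5": 2},
    "P5": {"P3": 20, "P4": 2},
}
history = ford_rounds(cost, "P5")
final = history[-1]
```

In a real distributed setting each node would run only its own update and exchange D values with its neighbors; the single loop here stands in for that message exchange.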
An example
[Figure: a weighted network of five nodes P1 to P5; the link costs shown are 1, 4, 3, 2, 5, 20, and 2.]
Dijkstra’s centralized algorithm
Round    N                   D(1)  D(2)  D(3)  D(4)
Initial  {P5}                ∞     ∞     20    2
1        {P5,P4}             ∞     3     4     2
2        {P5,P4,P2}          7     3     4     2
3        {P5,P4,P2,P3}       7     3     4     2
4        {P5,P4,P2,P3,P1}    7     3     4     2
Ford’s distributed algorithm
Round    P1        P2        P3        P4
Initial  (., ∞)    (., ∞)    (., ∞)    (., ∞)
1        (., ∞)    (., ∞)    (P5,20)   (P5,2)
2        (P3,25)   (P4,3)    (P4,4)    (P5,2)
3        (P2,7)    (P4,3)    (P4,4)    (P5,2)
Unicasting in Special-Purpose
Networks

The routing algorithms in the previous
section are general and are suitable for all
types of network topologies. However, they
may not be efficient for special-purpose
networks such as rings, meshes, and
hypercubes.
Bidirectional rings


Deterministic unicasting on a bidirectional ring is simple: a message
is forwarded along one direction (clockwise or counterclockwise)
depending on the position of the destination.
In multiple-path routing two paths can be used: one along the
clockwise direction and the other counterclockwise direction. Two
copies of the routing message are sent, one to each direction; or the
message is halved and each half is forwarded to a different direction.
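The deterministic case can be sketched as below. The sketch assumes nodes are numbered 0 to n-1 and that "clockwise" means increasing addresses; both are conventions chosen for illustration, and ties are broken in favor of the clockwise direction.

```python
def ring_route(s, d, n):
    """Return (direction, path) for a shortest route on an n-node bidirectional ring."""
    cw = (d - s) % n     # hops needed going clockwise
    ccw = (s - d) % n    # hops needed going counterclockwise
    if cw <= ccw:
        direction, step, hops = "clockwise", 1, cw
    else:
        direction, step, hops = "counterclockwise", -1, ccw
    path = [(s + step * i) % n for i in range(hops + 1)]
    return direction, path

# Example on an 8-node ring.
print(ring_route(1, 6, 8))   # ('counterclockwise', [1, 0, 7, 6])
print(ring_route(0, 3, 8))   # ('clockwise', [0, 1, 2, 3])
```

For multiple-path routing, the same function could be called once per direction by forcing `step` to +1 and -1 instead of picking the shorter side.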
Meshes

Adaptive routing and XY routing in 2-d
mesh
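As a rough illustration of the contrast, XY routing in its usual form corrects the X coordinate completely and then the Y coordinate, so exactly one path is used, while a minimal adaptive scheme may pick any of the shortest paths. The coordinate convention (x, y) and the helper names below are assumptions for this sketch.

```python
from math import comb

def xy_route(src, dst):
    """Deterministic XY routing: move along X until aligned, then along Y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def minimal_path_count(src, dst):
    """Number of shortest paths a minimal adaptive router could choose from."""
    dx, dy = abs(dst[0] - src[0]), abs(dst[1] - src[1])
    return comb(dx + dy, dx)

print(xy_route((0, 0), (2, 1)))            # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(minimal_path_count((0, 0), (2, 1)))  # 3
```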
Hypercubes



The length of the shortest path between two nodes u and w is the
Hamming distance between u and w, denoted as H(u,w).
The number of shortest node-disjoint paths equals the Hamming
distance between the source and destination nodes. If the dimensions
are selected in a predefined order, the routing is deterministic and is
called e-cube routing.
The multiple-path routing in hypercubes is based on the following
property: If two nodes s and d are separated by a Hamming distance of k
in an n-cube, there are n node-disjoint paths between nodes s and d.
Out of these n paths, k have a length of k and the remaining n-k have a
length of k+2.
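A small sketch of e-cube routing, assuming node addresses are integers whose bits are the hypercube coordinates and that the predefined order is "lowest dimension first". The order is an arbitrary choice; any fixed order gives a deterministic route of length H(s,d).

```python
def hamming(u, w):
    """H(u,w): number of bit positions in which u and w differ."""
    return bin(u ^ w).count("1")

def e_cube_route(s, d, n):
    """Correct the differing bits of s and d in a fixed order (dimension 0 first)."""
    path, cur = [s], s
    for i in range(n):
        if ((cur ^ d) >> i) & 1:   # bit i still differs from the destination
            cur ^= 1 << i          # cross dimension i
            path.append(cur)
    return path

route = e_cube_route(0b000, 0b110, 3)
print([format(v, "03b") for v in route])   # ['000', '010', '110']
```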
An example
3 node-disjoint paths between s = 000 and d = 110:
Path 1: 000->100->110
Path 2: 000->010->110
Path 3: 000->001->011->111->110
3 node-disjoint paths between 000 and 100:
Path 1: 000->100
Path 2: 000->001->101->100
Path 3: 000->010->110->100
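One common way to build such a set of paths is sketched below; it is an illustrative construction and not necessarily the one used to draw the example. The k shortest paths correct the differing dimensions in rotated orders, and each of the remaining n-k paths first steps out along a non-differing dimension, corrects all differing dimensions, and then steps back. It assumes s and d are distinct.

```python
def disjoint_paths(s, d, n):
    """Return n node-disjoint paths from s to d in an n-cube (s != d assumed)."""
    diff = [i for i in range(n) if ((s ^ d) >> i) & 1]
    same = [i for i in range(n) if i not in diff]

    def walk(dims):
        # Follow the given sequence of dimensions starting from s.
        path, cur = [s], s
        for i in dims:
            cur ^= 1 << i
            path.append(cur)
        return path

    # k paths of length k: rotations of the differing dimensions.
    paths = [walk(diff[j:] + diff[:j]) for j in range(len(diff))]
    # n-k paths of length k+2: detour through a non-differing dimension m.
    paths += [walk([m] + diff + [m]) for m in same]
    return paths

for p in disjoint_paths(0b000, 0b110, 3):
    print("->".join(format(v, "03b") for v in p))
```

For 000 and 110 this prints the same three paths as the example, with lengths 2, 2, and 4.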
Broadcasting in Special-Purpose
Networks - Rings

Broadcasting in a ring is simple: two copies of the message are sent
in opposite directions and they terminate at the two
furthermost nodes, respectively. The total number of steps
is about half the number of nodes.


One-port model: a node can only forward a
copy of the message to one of its neighbors
in one step.
All-port model: a node can forward a copy
of the message to all its neighbors in one
step.
Contention-free broadcasting in a
wormhole-routed ring: one port

For the one-port model, the best strategy is: the source s
sends the message to the furthermost node in the first step.
Partition the ring into two equal halves with one node that
has a copy of the message in each half. The above process
is repeated until all the nodes have a copy. The total
number of steps is log n.
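[Figure: one-port broadcast in a ring; the labels 1, 2, and 3 mark the step in which each node receives the message.]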
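The halving strategy can be simulated with a short sketch. It assumes n is a power of two, nodes are numbered 0 to n-1, and every holder sends clockwise to the middle of the segment it is responsible for; these conventions are illustrative choices, not part of the original description.

```python
def one_port_ring_broadcast(n, source=0):
    """Return (steps, schedule); schedule holds (step, sender, receiver) triples."""
    schedule = []
    # Each entry is (holder, length): the holder covers `length` nodes clockwise.
    segments = [(source, n)]
    step = 0
    while segments[0][1] > 1:
        step += 1
        nxt = []
        for holder, length in segments:
            half = length // 2
            receiver = (holder + half) % n   # furthermost node of the segment
            schedule.append((step, holder, receiver))
            nxt += [(holder, half), (receiver, half)]
        segments = nxt
    return step, schedule

steps, schedule = one_port_ring_broadcast(8)
print(steps)   # 3
```

Since every message stays inside its own segment, no two messages of the same step share a link, which is what makes the broadcast contention-free under wormhole routing.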
Contention-free broadcasting in a
wormhole-routed ring: all-port

For the all-port model, using the cut-through model, the
source can send the message to two nodes that are n/3
distance away, where n is the total number of nodes. In the
next step each of the three nodes sends the message to two
nodes that are n/9 distance away. In general, after k steps
3^k nodes have a copy and each sends the message to two
nodes that are n/3^(k+1) distance away. Basically, this
approach cuts a path into three subpaths of equal length
with the center node of each subpath as the only node
with a copy of the routing message.
[Figure: all-port broadcast in a ring; the labels 1 and 2 mark the step in which each node receives the message.]
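A brief simulation of the tripling pattern, assuming n is a power of three and nodes are numbered 0 to n-1 (both assumptions made only to keep the arithmetic exact):

```python
def all_port_ring_broadcast(n, source=0):
    """Return a list of (distance, holders_after_step), one entry per step."""
    holders = {source}
    rounds = []
    dist = n // 3
    while dist >= 1:
        new = set()
        for h in holders:
            # Each holder uses both of its ports in the same step.
            new.add((h + dist) % n)
            new.add((h - dist) % n)
        holders |= new
        rounds.append((dist, sorted(holders)))
        dist //= 3
    return rounds

for dist, holders in all_port_ring_broadcast(27):
    print(dist, len(holders))   # distances 9, 3, 1 with 3, 9, 27 holders
```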
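Broadcasting in a wormhole-routed mesh: one-port
[Figure: one-port broadcast from source S in a mesh; the labels 1 and 2 mark the steps.]
A broadcast with message partition in 2-d meshes
[Figure: three panels for a broadcast from source S.]
Personalized broadcast of ¼ message in one row.
Broadcast of ¼ message in columns.
Collecting four ¼ messages in each row.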
Hypercubes
[Figure: a broadcasting initiated from 000 in a 3-cube; the labels 1, 2, and 3 mark the step in which each link is used.]
[Figure: a Hamiltonian cycle in a 3-cube.]
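Both figures can be reproduced in spirit with a short sketch. The broadcast below assumes the one-port model and uses dimension 0 in step 1, dimension 1 in step 2, and so on (the order is an arbitrary choice); the cycle uses the binary-reflected Gray code, which is one standard way to obtain a Hamiltonian cycle in an n-cube and may differ from the cycle drawn in the figure.

```python
def hypercube_broadcast(n, source=0):
    """Return (step, sender, receiver) triples for a broadcast in an n-cube."""
    holders = [source]
    schedule = []
    for step in range(1, n + 1):
        bit = 1 << (step - 1)
        # Every node that already has the message sends across one dimension.
        sends = [(step, h, h ^ bit) for h in holders]
        schedule += sends
        holders += [dst for _, _, dst in sends]
    return schedule

def gray_cycle(n):
    """Visit all 2^n nodes so that consecutive nodes differ in exactly one bit."""
    return [i ^ (i >> 1) for i in range(2 ** n)]

print(len(hypercube_broadcast(3)))                # 7
print([format(v, "03b") for v in gray_cycle(3)])
```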
Path-based Approach
A multicast in a 4x4 mesh. Nodes are labeled along a Hamiltonian path:
 0   7   8  15
 1   6   9  14
 2   5  10  13
 3   4  11  12
[Figure legend: low-channel network, high-channel network.]
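The labeling above can be generated with a small function, and a common use of it is to split the destination set in two: destinations with labels higher than the source are reached through the high-channel network in increasing label order, and the others through the low-channel network in decreasing order. The sketch below assumes that reading of the figure, a (column, row) coordinate convention, and hypothetical helper names.

```python
def label(col, row, size=4):
    """Hamiltonian-path label: columns are traversed alternately down and up."""
    return col * size + (row if col % 2 == 0 else size - 1 - row)

def dual_path_split(source, dests, size=4):
    """Split destination labels into (high, low) visiting orders."""
    s = label(*source, size)
    labels = [label(*d, size) for d in dests]
    high = sorted(x for x in labels if x > s)
    low = sorted((x for x in labels if x < s), reverse=True)
    return high, low

# Rebuild the 4x4 labeling row by row.
for row in range(4):
    print([label(col, row) for col in range(4)])

# Hypothetical multicast: source at (1,1), which carries label 6.
print(dual_path_split((1, 1), [(0, 0), (3, 0), (2, 2), (0, 3)]))   # ([10, 15], [3, 0])
```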
U-mesh algorithm
Source: (0,0) Destinations: (1,0), (1,1), (1,2), (1,3), (2,0), (2,1), and (3,2)
The lexicographical order of destinations and source is:
(0,0), (1,0), (1,1), (1,2), (1,3), (2,0), (2,1), (3,2)
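The ordered list is split into two halves:
{(0,0), (1,0), (1,1), (1,2)} and {(1,3), (2,0), (2,1), (3,2)}
[Figure: the multicast in the mesh; the labels 1, 2, and 3 mark the step of each transmission.]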
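The halving of the ordered list can be sketched as follows. This is a simplified illustration that assumes the source comes first in the lexicographical order, as it does in the example; the function name and the rule "send to the first node of the upper half" are assumptions of the sketch, and the full U-mesh algorithm also has to handle a source that falls in the middle of the list.

```python
def u_mesh_schedule(source, dests):
    """Return (step, sender, receiver) triples for a recursive-halving multicast."""
    chain = sorted([source] + list(dests))   # lexicographical order
    if chain[0] != source:
        raise ValueError("this sketch assumes the source is first in the order")
    schedule = []
    segments = [(0, len(chain))]   # chain[lo] holds the message for chain[lo:hi]
    step = 0
    while any(hi - lo > 1 for lo, hi in segments):
        step += 1
        nxt = []
        for lo, hi in segments:
            if hi - lo > 1:
                mid = (lo + hi) // 2
                # The holder hands the upper half to that half's first node.
                schedule.append((step, chain[lo], chain[mid]))
                nxt += [(lo, mid), (mid, hi)]
            else:
                nxt.append((lo, hi))
        segments = nxt
    return schedule

dests = [(1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (3, 2)]
for entry in u_mesh_schedule((0, 0), dests):
    print(entry)
```

For the example this yields one transmission in step 1, two in step 2, and four in step 3, so the seven destinations are covered in three steps.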
Virtual Channels
[Figure: two 3x3 meshes with nodes numbered 1 to 9.]
[Figure: the 3x3 mesh separated into a positive network and a negative network.]
Unidirectional ring
[Figure: a four-node unidirectional ring with nodes P0 to P3, high virtual channels Ch0 to Ch3, and low virtual channels Cl0 to Cl3.]
Unidirectional ring algorithm


If the source address is larger than the destination
address, any channel can be used to start with;
however, once a high (or low) channel is selected,
the remaining steps should use high (or low)
channels exclusively.
If the source address is smaller than the
destination, high channels are used and high
virtual channels are switched to low virtual
channels after crossing node P3.
Turn model
Deadlock
Four turns allowed in XY-routing
Six turns allowed in positive-first routing
Six turns allowed in negative-first routing
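The turn counts can be checked by enumerating the eight possible 90-degree turns in a 2-d mesh. The prohibited turns below are inferred from the names and are assumptions of this sketch: XY routing forbids every turn from the Y dimension to the X dimension, positive-first forbids turning from a negative direction to a positive one, and negative-first forbids turning from a positive direction to a negative one.

```python
DIRECTIONS = ["+x", "-x", "+y", "-y"]

def turns():
    """All eight 90-degree turns as (incoming direction, outgoing direction)."""
    return [(a, b) for a in DIRECTIONS for b in DIRECTIONS if a[1] != b[1]]

def allowed(turn, policy):
    a, b = turn
    if policy == "xy":
        return a[1] == "x"                          # only X-to-Y turns
    if policy == "positive-first":
        return not (a[0] == "-" and b[0] == "+")    # no negative-to-positive turn
    if policy == "negative-first":
        return not (a[0] == "+" and b[0] == "-")    # no positive-to-negative turn
    raise ValueError(policy)

for policy in ("xy", "positive-first", "negative-first"):
    print(policy, sum(allowed(t, policy) for t in turns()))   # 4, 6, 6
```

Prohibiting two turns is enough to break both abstract cycles (clockwise and counterclockwise), which is why the two partially adaptive policies keep six of the eight turns.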
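Adaptivity of positive-first routing
[Figure: two placements of source s and destination d in the X-Y plane; the routing is fully adaptive in one case and deterministic in the other.]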