CS 386C
Synchronous Processor Group
Membership Protocol
 A. Mok 2014
In a distributed system, disagreement on group membership can cause serious problems:
● Compromised data integrity
● Loss of service availability
Correctness criteria for a solution
● All correct processors must agree on the system membership.
● System membership information must be up to date.
System assumptions
1) Processors and network may suffer performance failures.
   d - datagram timeout
   η - scheduling delay
2) Processors are synchronized.
   ε - synchronization error
3) Processors can communicate by synchronous atomic broadcast.
   D - processor-to-processor atomic broadcast delay
   δ = η+d+η - process-to-process datagram delay
   Δ = η+D+η - process-to-process atomic broadcast delay
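The two process-to-process delays follow directly from the definitions δ = η+d+η and Δ = η+D+η. A minimal sketch of that arithmetic, where the function name and the sample values are illustrative assumptions rather than anything taken from the slides:

```python
def process_to_process_delays(eta, d, D):
    """Return (delta, Delta) from the scheduling delay and the two
    processor-to-processor delays, per the definitions in the slides."""
    delta = eta + d + eta   # δ = η + d + η  (datagram)
    Delta = eta + D + eta   # Δ = η + D + η  (atomic broadcast)
    return delta, Delta


# Hypothetical values in arbitrary time units.
delta, Delta = process_to_process_delays(eta=1, d=5, D=20)
print(delta, Delta)  # 7 22
```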
Abstract specification assuming global time and knowledge
Precise Realistic Specification
● P - set of processors
● Correct(p,t): P → { T, F }; true if p is operational at time t.
● Joined(p,t): P → { T, F }; true if p is joined to a group at time t.
● Members(p,t): P → subset of P; p's view of membership at time t.
Safety requirement
● (S1) If Joined(p,t) then p ∈ Members(p,t).
● (S2) If Joined(p,t) ∧ Joined(q,t) then Members(p,t) = Members(q,t).
● (S3) If Joined(p,t) ∧ Correct(p,t') for all t' with t ≤ t' ≤ t″, then Joined(p,t″).
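Requirements (S1) and (S2) can be checked on a snapshot of the system at a single instant t. The following is a sketch only: the function name and the data layout (a set of joined processors plus a dict of views) are assumptions made for illustration. (S3) relates two instants and would need a trace over time, so it is not covered here.

```python
def check_safety_snapshot(joined, members):
    """Check (S1) and (S2) at one instant.

    joined:  set of processor ids p for which Joined(p, t) holds
    members: dict mapping each processor id p to its view Members(p, t)
    Returns a list of violation messages (empty if both hold).
    """
    violations = []
    ordered = sorted(joined)
    # (S1): every joined processor appears in its own view.
    for p in ordered:
        if p not in members[p]:
            violations.append(f"S1: {p} not in its own view")
    # (S2): any two joined processors hold identical views.
    for i, p in enumerate(ordered):
        for q in ordered[i + 1:]:
            if members[p] != members[q]:
                violations.append(f"S2: {p} and {q} disagree")
    return violations


views = {"a": {"a", "b"}, "b": {"b"}}
print(check_safety_snapshot({"a", "b"}, views))  # ['S2: a and b disagree']
```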
Liveness requirement
● (L1) Correct processors must be allowed to join within J time units, i.e., if ¬Joined(p,t) and Correct(p,t') for every t' with t ≤ t' ≤ t+J, then Joined(p,t+J).
  J - join latency
● (L2) Incorrect processors must be detected and a new group formed within D time units, i.e., if Joined(p,t) and ¬Correct(p,t), then for every q such that Correct(q,t') for all t' with t ≤ t' ≤ t+D, p ∉ Members(q,t+D).
  D - reconfiguration latency (not to be confused with the atomic broadcast delay D in the system assumptions)
Solutions
We shall discuss 3 protocols:
● Periodic broadcast
● Attendance list
● Neighbor surveillance
[Figure: Solution protocol stack]
Implementation assumptions
(I1) Incorrect processors must take at least R time units to restart.
(I2) A new group is formed (with a unique group Id) when a processor joins or departs from a group.
● Assumption (I1) can be enforced by having a restarted processor wait at least R time units before initiating a Join.
● We shall use the time at which a new group request (Join) is received as the Id of the new group. A new group will be formed (committed) one atomic broadcast after the Join request.
Join Implementation
Processor joins are handled identically by all 3 protocols.
The new group Id is the commit time V.
Atomic broadcast ensures agreement on the commit time V and Members(p,V).
J = 2Δ
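One way to read J = 2Δ from the pseudocode that follows in the slides is as two back-to-back atomic broadcasts: the "new-group" request carries V = myclock+Δ, and the "present" messages sent at V arrive one further Δ later. The sketch below encodes that reading. It assumes each atomic broadcast is delivered exactly Δ after it is sent, and the function and key names are hypothetical.

```python
def join_timeline(request_time, Delta):
    """Timeline of a join under the assumption that every atomic
    broadcast is delivered exactly Delta after it is sent."""
    V = request_time + Delta        # "new-group" delivered; V is the group Id
    view_installed_at = V + Delta   # "present" messages delivered
    return {
        "group_id": V,
        "view_installed_at": view_installed_at,
        "join_latency": view_installed_at - request_time,  # equals 2 * Delta
    }


print(join_timeline(request_time=100, Delta=22)["join_latency"])  # 44
```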
Periodic broadcast protocol: processor departure handling
● After joining at V, each member agrees to broadcast a “Present” message at times V+π, V+2π, ....
  π - check-in period
● If a member f fails to check in at V+π, atomicity ensures that at V+π+Δ all members agree that f is gone.
● Short reconfiguration latency: D = π+Δ
● Significant message overhead
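The departure rule above amounts to rebuilding the view from whoever checked in for a given period. A minimal sketch, assuming "present" messages are represented as (check-in time, sender) pairs; the function names and sample ids are illustrative and not part of the slides:

```python
def view_from_present(messages, V):
    """Collect the senders of all "present" messages tagged with time V."""
    return {sender for (tag, sender) in messages if tag == V}


def departed(old_members, new_members):
    """Members of the old view that did not check in."""
    return set(old_members) - set(new_members)


def reconfiguration_latency(pi, Delta):
    """D = π + Δ for the periodic broadcast protocol."""
    return pi + Delta


# Hypothetical run: f crashes before the check-in at time 200.
old_view = {"p", "q", "f"}
msgs = [(200, "p"), (200, "q")]
new_view = view_from_present(msgs, 200)
print(sorted(new_view), sorted(departed(old_view, new_view)))  # ['p', 'q'] ['f']
```

With n members, every period costs n atomic broadcasts, which is the message overhead the slide refers to.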
 1) task Membership;
 2)   var group: Time; members: set-of-P initially {}
 3)       joined: Boolean initially false;
 4)   broadcast(“new-group”, myclock+Δ)
 5)   cycle
 6)     when receive(“new-group”,V)
 7)     do if myclock > V then abort fi;
 8)        cancel(Broadcast);
 9)        broadcast(“present”, V.myid);
10)        schedule(Broadcast, V+π) at V + π - η
11)     od;
12)     when receive(“present”,V,M)
13)     do if ¬joined & (myid ∈ M) then joined ← true fi;
14)        if members ≠ M then group ← V; members ← M fi
15)     od
16)   endcycle;

M is the collection of “present” messages, including myid. Line 13 concerns the case in which myid is the new group requestor.
1) task Broadcast(V:Time);
2)   if myclock > V then abort fi;
3)   broadcast(“present”,V.myid)
4)   schedule(Broadcast, V + π) at V + π - η
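The handler for “present” messages (lines 12 to 15 of the Membership task) can be rendered as a small state update. This is an illustrative Python translation only: the class name is an assumption, and timers, aborts, and the broadcast layer are deliberately left out.

```python
class MemberState:
    """Local state kept by the Membership task of one processor."""

    def __init__(self, myid):
        self.myid = myid
        self.group = None      # group Id (a time), unset until first view
        self.members = set()   # current view
        self.joined = False

    def on_present(self, V, M):
        """Apply the agreed set M of check-ins for time V."""
        # Line 13: a requestor becomes joined once it sees itself in M.
        if not self.joined and self.myid in M:
            self.joined = True
        # Line 14: a changed view starts a new group identified by V.
        if self.members != M:
            self.group = V
            self.members = set(M)


s = MemberState("p")
s.on_present(10, {"p", "q"})   # p joins group 10
s.on_present(20, {"p", "q"})   # same view, group Id unchanged
s.on_present(30, {"p"})        # q missed the check-in, new group 30
print(s.joined, s.group, sorted(s.members))  # True 30 ['p']
```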
Attendance list protocol: processor departure handling
1) After joining at V, the highest-ranked member issues an attendance list message periodically at times V+π, V+2π, ...
2) Each member must attach its Id to the attendance list.
3) A member that detects the loss of the list triggers a new group by an atomic broadcast.
Attendance list protocol
● The highest-ranked member circulates the attendance list along a virtual circuit connecting the correct processors at time V.
● Each member must attach its Id to the attendance list by time V+nδ+ε.
● Reconfiguration latency for a single failure is given by D1 = π+nδ+ε+J, where J = 2Δ. This assumes that the processor-to-processor delay is between 0 and δ. D1 = π+δ+ε+J if the processor-to-processor delay is exactly δ.
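The circulation of the list can be sketched as a walk around the virtual circuit. The slides do not spell out the ring order, so the code below assumes that ids are comparable ranks and that the list travels from the highest-ranked member through descending ranks and back to the highest. The function name is hypothetical.

```python
def circulate_list(members, failed):
    """Walk the attendance list around the ring once.

    Assumed ring order: highest rank first, then descending ranks,
    with the lowest-ranked member returning the list to the highest.
    Returns (visited, completed): the members that attached their Id
    after the initiator, and whether the list made it all the way round.
    """
    ring = sorted(members, reverse=True)
    if ring[0] in failed:
        return [], False            # the initiator never sends the list
    visited = []
    for p in ring[1:]:
        if p in failed:
            return visited, False   # list is lost at the failed member
        visited.append(p)
    return visited, True


print(circulate_list({1, 2, 3, 4}, failed={2}))  # ([3], False)
```

When the walk does not complete, the members downstream of the failure never see the list for that period, which is the loss that triggers a new group.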
Attendance list protocol: detection and reconfiguration latency in the general case of k failures
● Reconfiguration latency for k failures is denoted by Dk.
● Dk = 2π+(k-1)δ-(n-1)δ+ε+J, where J = 2Δ, if the processor-to-processor delay is exactly δ. This assumes that π > δ so that member check-in can finish before the end of the period.
  Dk = 2π+(k-1)δ+ε+J if the processor-to-processor delay is between 0 and δ.
● The worst-case detection latency occurs if the lowest-ranked member fails after passing the list in the first period and the highest-ranked k-1 members fail in the second period after they pass the list.
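For quick comparisons, the latency bounds stated for delays between 0 and δ can be evaluated numerically. The sketch below transcribes only those two formulas as written in the slides. The function names and sample values are assumptions, and n is taken to mean the number of group members.

```python
def attendance_d1(pi, n, delta, eps, Delta):
    """Single-failure bound: D1 = π + nδ + ε + J, with J = 2Δ."""
    J = 2 * Delta
    return pi + n * delta + eps + J


def attendance_dk(pi, k, delta, eps, Delta):
    """k-failure bound: Dk = 2π + (k-1)δ + ε + J, with J = 2Δ."""
    J = 2 * Delta
    return 2 * pi + (k - 1) * delta + eps + J


# Hypothetical parameters in arbitrary time units.
print(attendance_d1(pi=100, n=5, delta=7, eps=1, Delta=22))  # 180
print(attendance_dk(pi=100, k=2, delta=7, eps=1, Delta=22))  # 252
```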
task Membership;
  var members: set-of-P initially {}
      group: Time; L: Time initially -∞
      joined: Boolean initially false;
  broadcast(“new-group”, myclock+Δ)
  cycle
    when receive(“new-group”,V)
    do if myclock > V then abort fi;
       cancel(Membership-Check, Membership-Confirmation);
       broadcast(“present”, V.myid);
       schedule(Membership-Check, V+π) at V + π - η
    od;
    when receive(“present”,V,M)
    do if ¬joined & (myid ∈ M) then joined ← true fi;
       if members ≠ M then group ← V; members ← M fi
    od
    when receive(“list”, O)
    do if myclock ≤ O + γ
       then L ← O;
            if myid ≠ max(members)
            then send(“list”, O) to next(myid, members)
            fi;
       fi;
    od;
  endcycle

task Membership-Check(O:Time);
  if myid = max(members) then send(“list”, O) to next(myid,members) fi;
  schedule(Membership-Confirmation, O+γ+η) at O + γ;
  schedule(Membership-Check, O + π) at O + π - η;

task Membership-Confirmation(E: Time);
  if myclock > E then abort fi;
  if L + γ < E then broadcast(“new-group”, E+Δ) fi
Neighbor surveillance protocol: processor departure handling
1) After joining at time V, each member sends a “Present” message periodically to its successor in the virtual circuit at times V+π, V+2π, ...
2) A member that does not receive a “Present” message from its predecessor triggers a new group by atomic broadcast.
● Reconfiguration latency for k successive failures is given by D = kπ + δ + ε + 2Δ, where 2Δ = J.
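In neighbor surveillance, a failure is noticed by the first live successor of the failed member. A minimal sketch of that rule and of the latency formula, assuming the virtual circuit is given as a list in which each member sends to the next entry (wrapping around); the function names and sample ids are illustrative:

```python
def detectors(ring, failed):
    """Map each live member to the failed predecessor it misses.

    ring: member ids in virtual-circuit order; ring[i] sends its
    "Present" message to ring[(i + 1) % len(ring)].
    """
    n = len(ring)
    found = {}
    for i, p in enumerate(ring):
        successor = ring[(i + 1) % n]
        if p in failed and successor not in failed:
            found[successor] = p
    return found


def surveillance_latency(k, pi, delta, eps, Delta):
    """D = kπ + δ + ε + 2Δ for k successive failures."""
    return k * pi + delta + eps + 2 * Delta


# b and c are adjacent and both failed: only d notices at first (it
# misses c), so removing both takes successive periods.
print(detectors(["a", "b", "c", "d"], failed={"b", "c"}))  # {'d': 'c'}
```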
Group Atomic Broadcast
● Group atomic broadcast targets a subset of the processors in the network.
● Groups can be formed dynamically.
● The group founder obtains a unique group ID from a central registry and advertises the group to all the processors in the network.
● New_Group commands in group membership management must include the group’s ID.
● Negotiation among group members may enforce group-specific properties that are a function of the group’s membership.