CS 386C Synchronous Atomic Broadcast, A. Mok 2015

Synchronous Atomic Broadcast for Redundant Broadcast Channels


[Figure: System Architecture. Processor A and Processor B each run an application process on top of an OS kernel. Labels in the figure: SEND, RECEIVE, DELIVER, send, receive.]

Synchronous Atomic Broadcast for Redundant Broadcast Channels

● Atomic broadcast satisfies the following properties:

  ○ Atomicity: If any correct processor delivers an update at time U, then that update was initiated by some processor and is delivered by all correct processors at time U.

  ○ Order: All updates delivered by correct processors are delivered in the same order by each correct processor.

● Synchronous atomic broadcast also satisfies:

  ○ Termination: Every update whose broadcast is initiated by a correct processor at time T is delivered by all correct processors at time T+Δ.

System Assumptions

● n processors with distinct, totally ordered names and bounded broadcast rates

● Channels suffer omission failures (upper bound C on delay, but no atomicity)

● Out-adaptors suffer performance failures (normal bound O on adaptor delay)

● In-adaptors suffer omission failures (upper bound I, can be relaxed)

● Processors suffer crash failures (upper bound P on SEND and RECEIVE)

Systems Assumptions (cntd)

● Processor clocks are correct and synchronized within ε

● At most f ≤ n-2 failures during a broadcast

● DELIVER can be scheduled (by real-time executive)

● We shall use the end-to-end delay parameter, δ = P+O+C+I+P, later.

Need for forwarding to achieve atomicity when f > 1

What to do when a message (T,s,σ) that has not been seen before arrives?

Prompt forwarding rule: a processor forwards a message received on any channel to all other channels as soon as the message arrives.

In the absence of failure, prompt forwarding requires 1+f+(n-1)f = nf+1 messages for each broadcast.

Need for forwarding to achieve atomicity when f > 1 (cntd)

Lazy forwarding

The goal is to avoid forwarding messages when it is not necessary to do so, e.g., when no failure occurs.

Simple lazy forwarding rule

A sending processor s enqueues any message m to be broadcast on the out-adaptors to channels 1, 2, ..., f+1 in this order. Let p be an arbitrary processor different from s that receives m, and let c be the highest channel number on which p receives a copy of m. If c ≥ f, then p does not need to forward m, else p forwards m on channels c+1, ..., f.
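The rule above can be sketched as a small function. This is an illustrative sketch only: the name `simple_lazy_forward` is hypothetical, and the upper forwarding channel f is taken from the proof that follows (which argues about a copy being sent on all f channels), not from an explicit statement on the slide.

```python
def simple_lazy_forward(c: int, f: int) -> list[int]:
    """Channels on which a receiver forwards m under the simple lazy rule.

    c is the highest channel on which the receiver got a copy of m;
    the sender enqueues m on channels 1..f+1 in order.
    Assumption: forwarding covers channels c+1..f, as used in the proof.
    """
    if not 1 <= c <= f + 1:
        raise ValueError("c must be a channel number in 1..f+1")
    if c >= f:
        return []                       # enough copies already exist
    return list(range(c + 1, f + 1))    # fill in channels c+1..f
```

For example, with f = 3 a receiver that only saw channel 1 forwards on channels 2 and 3, while a receiver that saw channel 3 or 4 forwards nothing.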

Theorem

If a processor s initiates the broadcast of a message m, and some correct processor r (which does not fail in this broadcast) receives m and forwards m by following the simple lazy forwarding rule, then every correct processor q will also receive m.

Proof

If s has not failed, then m must have been sent on all f+1 channels. At least one of them must reach q. So assume s has failed.

Case c ≥ f

Subcase c = f+1: A copy of m must have been sent on each of the f+1 channels. At least one of them must reach q.

Proof (cntd)

Subcase c = f: A copy of m must have been sent on each of f channels. If s has failed (count 1 failure), then not all f channels can fail, since at most f-1 further failures can occur.

Case c < f: r forwards m so that m is sent on all f channels. If s has failed (count 1 failure), then not all f channels can fail.

Key advantage of lazy forwarding

In the absence of failures, only f+1 messages are sent. This implies scalability (because of independence from n). f+1 is also the minimum number of messages required.

Theorem

f+1 is the minimum number of messages that must be sent in each broadcast to achieve atomicity and termination.
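To make the scalability claim concrete, the failure-free message counts of the two rules can be compared directly. The function names below are hypothetical; the formulas are the ones stated in the slides (nf+1 for prompt forwarding, f+1 for lazy forwarding).

```python
def prompt_forwarding_messages(n: int, f: int) -> int:
    """Failure-free count for prompt forwarding: the sender uses f+1
    channels, and each of the other n-1 processors forwards to f channels."""
    return (1 + f) + (n - 1) * f


def lazy_forwarding_messages(f: int) -> int:
    """Failure-free count for lazy forwarding: only the sender's f+1 copies."""
    return f + 1


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, prompt_forwarding_messages(n, 2), lazy_forwarding_messages(2))
```

The prompt count grows linearly with n, while the lazy count stays fixed once f is chosen.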

Lower bound on messages sent is f+1

For example, consider the case f = 2 where only 2 messages are sent. In the above scenario, at least two of {p, q, r}, say q and r, must not have sent any message; that is, they must receive the update from some other processor. However, q but not r will receive the message if r loses both in-adaptors.

A simple case: single-fault tolerance

● No forwarding is needed.

● To ensure order, order delivery by timestamp and processor name. Each processor that receives a message m timestamped T must wait till T+δ+ε before delivering m to a process. This ensures that all messages timestamped T have arrived before any of them is delivered.

  ○ δ (end-to-end delay) = P+O+C+I+P

  ○ ε = maximum deviation between processor clocks

The single-fault tolerant protocol

task Start;
  const Δ = δ + ε;
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ); T ← clock;
    for c = 1 to 2 do send (T,myid,σ) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)

task Receive;
  const Δ = δ + ε;
  var U,T: Time; σ: Update; s: Processor;
  cycle
    receive (T,s,σ) from c; U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T)) then "deja vu" iterate fi;
    H ← H ∪ (T,s,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)

task Deliver(T: Time);
  var p: Processor; val: Processor→Update;
  val ← H(T);
  while dom(val) ≠ { } do
    p ← min(dom(val)); RECEIVE(val(p)); val ← val \ p;
  od;
  H ← H \ T;
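The Receive and Deliver tasks can be mimicked in a few lines of Python. This is a minimal single-process sketch, not the lecture's implementation: the class name `SingleFaultReceiver`, the explicit `now` argument (standing in for the local clock reading U), and the `delivered` list (standing in for the RECEIVE upcall) are all assumptions made for illustration.

```python
class SingleFaultReceiver:
    """Toy model of the Receive/Deliver tasks for one processor."""

    def __init__(self, delta: float, epsilon: float):
        self.big_delta = delta + epsilon   # Δ = δ + ε
        self.history = {}                  # H: T -> {sender: update}
        self.delivered = []                # stands in for RECEIVE(...)

    def receive(self, T, s, update, now):
        """Handle message (T, s, update) read at local clock time `now`."""
        if now >= T + self.big_delta:
            return "late message"
        if s in self.history.get(T, {}):
            return "deja vu"
        self.history.setdefault(T, {})[s] = update
        return "accepted"                  # Deliver(T) would be scheduled at T+Δ

    def deliver(self, T):
        """Deliver all updates timestamped T in order of processor name."""
        val = self.history.pop(T, {})      # H ← H \ T
        for p in sorted(val):              # p ← min(dom(val)) repeatedly
            self.delivered.append(val[p])
```

Because every processor delivers the updates for a given timestamp in the same name order, and timestamps are handled at T+Δ, all correct processors see the same delivery order.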

The single-fault tolerant protocol

task Start;
  const Δ = δ + ε;
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ); T ← clock;
    for c = 1 to 2 do send (T,myid,σ) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

task Deliver(T: Time);
  var p: Processor; val: Processor→Update;
  val ← H(T);
  while dom(val) ≠ { } do
    p ← min(dom(val)); RECEIVE(val(p)); val ← val \ p;
  od;
  H ← H \ T;

task Receive;
  const Δ = δ + ε;
  var U,T: Time; σ: Update; s: Processor;
  cycle
    receive (T,s,σ) from c; U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T)) then "deja vu" iterate fi;
    H ← H ∪ (T,s,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The general case: tolerance to up to f faults

Ideas: Discard late messages by using the hop count h. A message time-stamped T, received at local time U with hop count h, is timely if U < T+h(δ+ε).

If a correct processor forwards a timely message, then the forwarded message will be timely for all correct processors.

Ideas (cntd)

If nothing goes wrong, broadcast will be accomplished in one hop. If (T,s,σ,h), h>1, is accepted by processor p, then there must have been some faults since the broadcast started.

Accordingly, p does not need to forward on all f+1 channels. In case forwarding is needed, we shall show that there must have been at least h faults, and so p only needs to ensure that there are at least f+1-h copies of the message that can reach the remaining correct processors.

Lazy forwarding rule

To initiate a broadcast, a sender s enqueues messages (T,s,σ,1) on channels 1, 2, ..., f+1 in that order. Let (T,s,σ,h), h ≤ k, be a message accepted by a processor p ≠ s, and let c be the highest channel on which p receives a copy of the message by local time T+h(δ+ε). If c < f+1-h at time T+h(δ+ε) on p's clock, then p forwards (T,s,σ,h+1) on channels c+1, ..., f+1-h.
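The channel arithmetic of the rule can be sketched as follows. The function name `lazy_forward_channels` is hypothetical, and the code assumes k = ⌊f/2⌋ (the floor reading of the k = [f/2] that appears later in the slides).

```python
def lazy_forward_channels(c: int, h: int, f: int) -> list[int]:
    """Channels on which p forwards (T, s, σ, h+1) at local time T+h(δ+ε).

    c: highest channel on which p received a copy of the hop-h message.
    Assumption: k = f // 2 bounds the hop counts that are ever forwarded.
    """
    k = f // 2
    if h > k:
        return []                          # hop count too large to forward
    top = f + 1 - h                        # highest channel that must be covered
    if c >= top:
        return []                          # enough copies already exist
    return list(range(c + 1, top + 1))     # channels c+1 .. f+1-h
```

With f = 6, a hop-1 message last seen on channel 2 is forwarded on channels 3 to 6, while a hop-3 message last seen on channel 3 is forwarded only on channel 4.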

Termination time

Δ ≤ w + δ + ε

where w = worst-case-delay-to-first-correct-processor. In general,

w = k(δ+ε) if f = 2k
w = (k+1)(δ+ε) if f = 2k+1

Termination time

[Figure: worst-case-delay-to-first-correct-processor scenario (e.g., f=5)]

Termination time (cont.)

Δ can actually be set to (k+1)(δ+ε), where k = ⌊f/2⌋, regardless of whether f is even or odd.
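As a quick numerical check of the termination time, the sketch below evaluates Δ = (k+1)(δ+ε). The function name is hypothetical, and k = ⌊f/2⌋ is assumed to be the floor, which makes the f = 1 case agree with the single-fault protocol's Δ = δ+ε.

```python
def termination_time(f: int, delta: float, epsilon: float) -> float:
    """Δ = (k+1)(δ+ε) with k = floor(f/2), for even and odd f alike."""
    k = f // 2
    return (k + 1) * (delta + epsilon)


if __name__ == "__main__":
    for f in range(1, 7):
        print(f, termination_time(f, delta=4, epsilon=2))
```

Note that Δ grows by one (δ+ε) step for every two additional tolerated faults.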

Example (f=6, need 7 channels)

Definitions

Suppose the original sender s is processor p_0. Let c_h be the highest-numbered channel on which a correct processor p_h receives a hop-h message (T,s,σ,h) with h ≤ k, where k = ⌊f/2⌋.

Recursively define p_i, c_i, i = h-1, ..., 1 backward as follows: p_i is the processor that sends the message (T,s,σ,i+1) to p_{i+1}, and c_i is the highest-numbered channel on which p_i received the hop-i message (T,s,σ,i).

p_1 is the processor that received a hop-1 message from the sender s = p_0, and it sends the message (T,s,σ,2) to p_2. The highest-numbered channel on which p_1 has received message (T,s,σ,1) is c_1. Notice that c_i > c_j for all i > j.

Definitions (continued)

Let A_i be the set of components: { (1) p_{i-1}, (2) p_{i-1}'s out-adaptor to channel c_i + 1, (3) channel c_i + 1, (4) p_i's in-adaptor to channel c_i + 1 }. These four components are in red in the picture below.

Recursively let C_1 = A_1, C_{i+1} = C_i ∪ A_{i+1}.

Notice that C_i = A_1 ∪ A_2 ∪ ... ∪ A_i and the sets C_i and A_{i+1} are disjoint.

[Figure: component sets A_1, A_2, ..., A_i along the chain of processors P_0, P_1, ..., P_{i-1}, P_i]

Theorem

Suppose by local time T_h = T+h(δ+ε), processor p_h accepts a message (T,s,σ,h). If c_h ≤ f+1-h, then there are at least h faults in the set C_h by time T_h. If c_h > f+1-h, then there are at least h-1 faults in C_h.

[Figure: component sets A_1, A_2, ..., A_h along the chain P_0, P_1, ..., P_{h-1}, P_h, with C_h = A_1 ∪ A_2 ∪ ... ∪ A_h. A_h includes the sender of the hop-h message. P_h receives the hop-h message.]

Induction Proof

● Theorem: Suppose by local time T_h = T+h(δ+ε), processor p_h accepts a message (T,s,σ,h). If c_h ≤ f+1-h, then there are at least h faults in the set C_h by time T_h. If c_h > f+1-h, then there are at least h-1 faults in C_h.

Base step h = 1

  ○ p_1 did not receive on channel f+1 if c_1 ≤ f+1-1 = f. Hence one of the components in C_1 must have failed. Else, nothing has failed and there are h-1 = 0 faults.

[Figure: component sets A_1, A_2, ..., A_{i+1} along the chain P_0, P_1, ..., P_i, P_{i+1}]

Induction Proof (continued)

● Induction step assumes the case holds for h = i

  ○ Suppose by time T_{i+1} = T+(i+1)(δ+ε), p_{i+1} received (T,s,σ,i+1) on channels no higher than c_{i+1}. By definition, p_i must have forwarded (T,s,σ,i+1) to p_{i+1} on channels c_i + 1, ..., f+1-i. The need for p_i to forward the message means c_i < f+1-i, and therefore C_i has at least i faults.

  ○ If c_{i+1} ≤ f+1-h = f+1-(i+1) = f-i, then p_{i+1} did not receive on channel f+1-i. This means either p_i has crashed, or there is a fault with the out-adaptor of p_i on channel c_{i+1} + 1, or the channel c_{i+1} + 1, or the in-adaptor of p_{i+1} on c_{i+1} + 1. Thus there is a fault in A_{i+1}. Since p_i forwards to p_{i+1} on channels > c_i, we have c_{i+1} > c_i and the sets C_i and A_{i+1} are disjoint. Since C_{i+1} = C_i ∪ A_{i+1}, there are at least i+1 faults in C_{i+1}.

  ○ If c_{i+1} > f+1-h, we only need to show h-1 = i+1-1 = i faults in C_{i+1}, but this is already assured by the induction hypothesis, which assumes that C_i has at least i faults, and so must C_{i+1}.

The Unanimity Property

If a correct processor p (one that does not crash during the broadcast) accepts a broadcast by time T+Δ on p's clock, then each correct processor q accepts the broadcast by time T+Δ on q's clock.

[Figure: component sets A_1, A_2, ..., A_{i+1} along the chain P_0, P_1, ..., P_i, P_{i+1}, with C_h = A_1 ∪ A_2 ∪ ... ∪ A_h]

Unanimity Proof (continued)

Case c = f+1-h: A message has already been sent on channels 1, ..., f+1-h. There are at most f-h faults which can affect any correct processor not in C_h.

Case c > f+1-h: Then there are at least h-1 faults in C_h, i.e., there are at most f-(h-1) faults that can affect processors not in C_h. However, at least f+1-h+1 messages have been sent. One of them must reach a correct processor.

[Figure: component sets A_1, A_2, ..., A_{i+1} along the chain P_0, P_1, ..., P_i, P_{i+1}, with C_h = A_1 ∪ A_2 ∪ ... ∪ A_h]

The Unanimity Proof (continued)

Suppose by local time T_h = T+h(δ+ε), a correct processor p accepts a message (T,s,σ,h). Let c be the highest channel on which p receives a copy of the message.

● Case c < f+1-h: Then there are at least h faults in the set C_h by time T_h. Notice that the processors in C_h in the theorem have all received the broadcast message, i.e., the faults do not involve any processor (or its in-adaptor) that has not received a copy by time T_h. Also, all the channels in C_h that may be faulty, {c_i + 1, i ≤ h}, have messages forwarded on them by the corresponding p_i's. Since p is correct, it sends messages on all of channels c+1 to f+1-h. Thus there is a message (not corrupted by any of the h failures in C_h) on channels 1, ..., c, c+1, ..., f+1-h for processors not in C_h to receive.

Since at most f-h faults can affect any processor not already in C_h, one of the f+1-h channels must reach any of these processors.

The f-fault tolerant protocol

task Start;
  const Δ = ⌊f/2⌋(δ+ε) + (δ+ε);
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ); T ← clock;
    for c = 1 to f + 1 do send (T,myid,σ,1) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T + Δ;
  endcycle

The f-fault tolerant protocol (cntd)

task Receive;
  const Δ = ⌊f/2⌋(δ+ε) + (δ+ε);
  var U,T: Time; σ: Update; s: Processor; h: Integer;
  cycle
    receive (T,s,σ,h) from c; U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if U ≥ T + h(δ+ε) then "too late to forward" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T))
    then "deja vu" C(T)(s) ← max{c, C(T)(s)};
    else
      H ← H ∪ (T,s,σ);
      if h ≤ ⌊f/2⌋ & c < f + 1 - h then
        C ← C ∪ (T,s,c);
        schedule Forward(T,s,h) at T+h(δ+ε);
      fi
      schedule Deliver(T) at T + Δ;
    fi
  endcycle

The f-fault tolerant protocol (continued)

task Forward(T: Time; s: Processor; h: Integer);
  if C(T)(s) < f+1-h
  then for i = C(T)(s) + 1 to f+1-h do send (T,s,H(T)(s),h+1) on i;
  fi

"H(T)(p) = the update σ broadcast by p at time T."
"C(T)(s) = highest channel on which a message (T,s,*,*) was received."
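The Receive and Forward tasks can be exercised with a small Python model. This is a hedged sketch rather than the lecture's implementation: the class name `FaultTolerantReceiver`, the explicit `now` argument (the local clock reading U), and the `pending` list (standing in for the real-time scheduler) are hypothetical. One deliberate deviation is labeled in the code: the highest channel C(T)(s) is recorded on every first acceptance, so that the "deja vu" update always has a value to compare against.

```python
class FaultTolerantReceiver:
    """Toy model of the f-fault tolerant Receive and Forward tasks."""

    def __init__(self, f: int, delta: float, epsilon: float):
        self.f = f
        self.step = delta + epsilon               # δ + ε
        self.big_delta = (f // 2 + 1) * self.step # Δ = ⌊f/2⌋(δ+ε) + (δ+ε)
        self.H = {}        # (T, s) -> update
        self.C = {}        # (T, s) -> highest channel seen so far
        self.pending = []  # scheduled Forward calls: (due time, T, s, h)

    def receive(self, T, s, update, h, c, now):
        if now >= T + self.big_delta:
            return "late message"
        if now >= T + h * self.step:
            return "too late to forward"
        key = (T, s)
        if key in self.H:
            self.C[key] = max(c, self.C[key])
            return "deja vu"
        self.H[key] = update
        self.C[key] = c    # assumption: always record the first channel
        if h <= self.f // 2 and c < self.f + 1 - h:
            self.pending.append((T + h * self.step, T, s, h))
        return "accepted"

    def forward(self, T, s, h):
        """Return (channel, message) pairs to send when Forward fires."""
        key = (T, s)
        top = self.f + 1 - h
        msg = (T, s, self.H[key], h + 1)
        return [(ch, msg) for ch in range(self.C[key] + 1, top + 1)]
```

If a second copy arrives on a higher channel before the scheduled Forward fires, the recorded channel rises and the forward set shrinks, which is what keeps the failure-free message count at f+1.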

Unanimity Compromise

Unanimity may be violated if correct processors do not wait long enough for messages to arrive on all channels before forwarding. For example, if processors wait till T+hδ+ε for a hop-h message time-stamped T, instead of waiting till T+h(δ+ε), then unanimity may be compromised. Consider an example where f = 2, ε = 2, δ = 4.

Unanimity Compromise

Let Clock1(0) = 0, Clock2(0) = 1, Clock3(0) = 0, Clock4(0) = 2. Unanimity can be violated by a processor performance failure as shown above. Here f = 2, ε = 2, δ = 4, and processors wait till T+hδ+ε for a hop-h message time-stamped T, instead of waiting till T+h(δ+ε).

Unanimity Compromise (continued)

[Figure: clock readings for processors S, E and L. At time 0: C_S(0) = 0, C_E(0) = 0, C_L(0) = 0. At time 11: C_E(11) = 13, C_L(11) = 11. Actual transmissions take 11 instead of δ time units.]

S takes 11 time units to send the message to E and L, whereas δ = 10, ε = 2, i.e., the message should arrive at E and L by their local time 12 in the worst case. E will reject but L will accept the message.
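The accept/reject outcome in this scenario follows from the timeliness test U < T+h(δ+ε). The sketch below replays the numbers from the slide; the function name `accepts` is hypothetical, and it assumes the message is a hop-1 message time-stamped T = 0 (S's clock reading when the send starts).

```python
def accepts(local_clock: float, timestamp: float, hops: int,
            delta: float, epsilon: float) -> bool:
    """Timeliness test: accept iff U < T + h(δ+ε)."""
    return local_clock < timestamp + hops * (delta + epsilon)


# Local clock readings when the message arrives at real time 11.
readings = {"E": 13, "L": 11}

# Assumption: hop-1 message time-stamped T = 0, with δ = 10 and ε = 2.
verdicts = {name: accepts(u, 0, 1, 10, 2) for name, u in readings.items()}

if __name__ == "__main__":
    print(verdicts)
```

The deadline on each local clock is 12, so E (reading 13) rejects while L (reading 11) accepts, and the two correct processors disagree.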
