
CS 386C

Synchronous Atomic Broadcast  A. Mok 2015

Synchronous Atomic Broadcast for Redundant Broadcast Channels

[Figure: System Architecture. Processor A and Processor B, each with an application process and an OS kernel; labeled operations: SEND, RECEIVE, DELIVER, send, receive.]

Synchronous Atomic Broadcast for Redundant Broadcast Channels

● Atomic broadcast satisfies the following properties:
  ○ Atomicity: If any correct processor delivers an update at time U, then that update was initiated by some processor and is delivered by all correct processors at time U.
  ○ Order: All updates delivered by correct processors are delivered in the same order by each correct processor.

● Synchronous atomic broadcast also satisfies:
  ○ Termination: Every update whose broadcast is initiated by a correct processor at time T is delivered by all correct processors at time T+Δ.

System Assumptions

● n processors with distinct, totally ordered names and also bounded broadcast rates
● Channels suffer omission failures (upper bound C on delay, but no atomicity)
● Out-adaptors suffer performance failures (normal bound O on adaptor delay)
● In-adaptors suffer omission failures (upper bound I, can be relaxed)
● Processors suffer crash failures (upper bound P on SEND and RECEIVE)

System Assumptions (cntd)

● Processor clocks are correct and synchronized within ε
● At most f ≤ n-2 failures during a broadcast
● DELIVER can be scheduled (by real-time executive)
● We shall use the end-to-end delay parameter, δ = P+O+C+I+P, later.

Need for forwarding to achieve atomicity when f > 1

What to do when a message (T,s,σ) that has not been seen before arrives?

Prompt forwarding rule: p forwards a message received on any channel c to all other channels as soon as the message arrives.

In the absence of failure, prompt forwarding requires 1+f+(n-1)f = nf+1 messages for each broadcast.

Need for forwarding to achieve atomicity when f > 1 (cntd)

Lazy forwarding

The goal is not to have to forward messages when it is not necessary to do so, e.g., when no failure occurs.

Simple lazy forwarding rule

A sending processor s enqueues any message m to be broadcast on the out-adaptors to channels 1, 2, ..., f+1 in this order. Let p be an arbitrary processor different from s that receives m, and let c be the highest channel number on which p receives a copy of m. If c ≥ f, then p does not need to forward m, else p forwards m on channels c+1, ..., f.

Theorem

If s initiates broadcast of message m, and some correct processor r (which does not fail in this broadcast) receives m and forwards m by following the simple lazy forwarding rule, then every correct processor q will also receive m.

Proof

If s has not failed, then m must have been sent on all f+1 channels. At least one of them must reach q. So assume s has failed. Let c be the highest channel number on which r receives a copy of m.

Case c ≥ f

Subcase c = f+1: A copy of m must have been sent on each of the f+1 channels. At least one of them must reach q.

Proof (cntd)

Subcase c = f: A copy of m must have been sent on each of f channels. If s has failed (count 1 failure), then not all f channels can fail, so at least one copy reaches q.

Case c < f

r forwards m so that m is sent on all f channels. If s has failed (count 1 failure), then not all f channels can fail, so at least one copy reaches q.

Key advantage of lazy forwarding

In the absence of failures, only f+1 messages are sent. This implies scalability (because of independence from n).

f+1 is also the minimum number of messages required.
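As a rough illustration of the scalability claim, the two failure-free message counts stated in these slides can be compared directly. This is only a sketch: the function names are hypothetical, and the counts are simply the slide formulas 1+f+(n-1)f and f+1.

```python
def prompt_forwarding_messages(n: int, f: int) -> int:
    """Failure-free message count under prompt forwarding: 1 + f + (n-1)f."""
    sent_by_sender = f + 1          # the sender uses all f+1 channels
    sent_by_others = (n - 1) * f    # each other processor forwards on the f other channels
    return sent_by_sender + sent_by_others


def lazy_forwarding_messages(f: int) -> int:
    """Failure-free message count under lazy forwarding: f + 1, independent of n."""
    return f + 1


if __name__ == "__main__":
    f = 2  # illustrative value
    for n in (4, 16, 64):
        print(n, prompt_forwarding_messages(n, f), lazy_forwarding_messages(f))
```

The prompt count grows linearly in n while the lazy count stays fixed, which is the independence from n noted above.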

Theorem

f+1 is the minimum number of messages that must be sent in each broadcast to achieve atomicity and termination.

Lower bound on messages sent is f+1

For example, consider the case f = 2 where only 2 messages are sent. In the above scenario, at least two of {p, q, r}, say q and r, must not have sent any message; that is, they must receive the update from some other processor. However, q but not r will receive the message if r loses both in-adaptors.

A simple case: single-fault tolerance

● No forwarding is needed.

● To ensure order, order delivery by timestamp and processor name. Each processor that receives m timestamped T must wait till T+δ+ε before delivering m to a process. This ensures that all messages timestamped T have arrived before any of them is delivered.
  ○ δ (end-to-end delay) = P+O+C+I+P
  ○ ε = maximum deviation between processor clocks
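The ordering rule above can be sketched in a few lines. This is a minimal illustration, not the protocol itself: messages are assumed to be (T, s, σ) tuples, processor names are assumed to be comparable values such as strings, and the function names are hypothetical.

```python
def delivery_time(T, delta, eps):
    """Local clock time at which a message timestamped T may be delivered."""
    return T + delta + eps


def delivery_order(messages):
    """Sort (T, s, sigma) messages by timestamp, then by processor name."""
    return sorted(messages, key=lambda m: (m[0], m[1]))


if __name__ == "__main__":
    received = [(5, "B", "u2"), (3, "C", "u1"), (5, "A", "u3")]  # illustrative data
    for T, s, sigma in delivery_order(received):
        print(T, s, sigma, "deliver at", delivery_time(T, delta=4, eps=2))
```

Because every correct processor sorts on the same key and waits until T+δ+ε, each one ends up with the same delivery sequence.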

The single-fault tolerant protocol

task Start;
  const Δ = δ + ε;
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ); T ← clock;
    for c = 1 to 2 do send (T,myid,σ) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)

task Receive;
  const Δ = δ + ε;
  var U,T: Time; σ: Update; s: Processor;
  cycle
    receive (T,s,σ) from c; U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T)) then "deja vu" iterate fi;
    H ← H ∪ (T,s,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)

task Deliver(T: Time);
  var p: Processor; val: Processor→Update;
  val ← H(T);
  while dom(val) ≠ { } do
    p ← min(dom(val)); RECEIVE(val(p)); val ← val \ p;
  od
  H ← H \ T;


The general case: tolerance to up to f faults

Ideas: Discard late messages by using hop count h. A message time-stamped T received at local time U with hop count h is timely if U < T+h(δ+ε).

If a correct processor forwards a timely message, then the forwarded message will be timely for all correct processors.
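The timeliness test is a single comparison on the receiver's local clock. A minimal sketch follows; the function name and the numeric values are illustrative assumptions.

```python
def is_timely(U, T, h, delta, eps):
    """True if a hop-h message timestamped T, received at local time U, is timely."""
    return U < T + h * (delta + eps)


if __name__ == "__main__":
    # Illustrative values: delta = 10, eps = 2, message timestamped T = 0.
    print(is_timely(U=11, T=0, h=1, delta=10, eps=2))  # within the one-hop window
    print(is_timely(U=13, T=0, h=1, delta=10, eps=2))  # past the one-hop window
    print(is_timely(U=13, T=0, h=2, delta=10, eps=2))  # a hop-2 copy gets a longer window
```

Each additional hop extends the acceptance window by one δ+ε, which is why a forwarded copy of a timely message can still be timely at the next receiver.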

Ideas (cntd)

If nothing goes wrong, the broadcast will be accomplished in one hop. If (T,s,σ,h), h>1, is accepted by processor p, then there must have been some faults since the broadcast started.

Accordingly, p does not need to forward on all f+1 channels. In case forwarding is needed, we shall show that there must have been at least h faults, and so p only needs to ensure that there are at least f+1-h copies of the message that can reach the remaining correct processors.

Lazy forwarding rule

To initiate a broadcast, a sender s enqueues messages (T,s,σ,1) on channels 1, 2, ..., f+1 in that order. Let (T,s,σ,h), h ≤ k, be a message accepted by a processor p ≠ s, and let c be the highest channel on which p receives a copy of the message by local time T+h(δ+ε). If c < f+1-h at T+h(δ+ε) on p's clock, then p forwards (T,s,σ,h+1) on channels c+1, ..., f+1-h.
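The channel selection in the lazy forwarding rule can be written as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, channels are numbered from 1, and the caller is assumed to have already checked that the message was accepted with h ≤ k.

```python
def forward_channels(c, f, h):
    """Channels on which to forward a hop-h message whose highest seen channel is c.

    Returns c+1, ..., f+1-h when c < f+1-h, and an empty list otherwise.
    """
    last = f + 1 - h
    if c < last:
        return list(range(c + 1, last + 1))
    return []


if __name__ == "__main__":
    print(forward_channels(c=1, f=3, h=1))  # copy seen only on channel 1
    print(forward_channels(c=3, f=3, h=1))  # copy seen up to channel f+1-h: nothing to do
```

With h = 1 the upper limit is f, which matches the simple lazy forwarding rule stated earlier.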

Termination time

Δ ≤ w + δ + ε, where w = worst-case-delay-to-first-correct-processor

In general,
  w = k(δ+ε) if f = 2k
  w = (k+1)(δ+ε) if f = 2k+1

Worst-case-delay-to-first-correct-processor scenario (e.g., f=5):

Termination time

Termination time (cont.)

Δ can actually be set to (k+1)(δ+ε) where k = [f/2], regardless of whether f is even or odd.
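A small helper makes the termination time concrete. It is a sketch that assumes [f/2] denotes the integer part of f/2; that reading is an assumption, chosen because it gives Δ = δ+ε for f = 1, in agreement with the single-fault tolerant protocol. The function name and the values of δ and ε are illustrative.

```python
def termination_time(f, delta, eps):
    """Delta = (k + 1) * (delta + eps), with k taken as the integer part of f/2."""
    k = f // 2  # assumption: [f/2] is read as integer division
    return (k + 1) * (delta + eps)


if __name__ == "__main__":
    for f in (1, 2, 6):
        print(f, termination_time(f, delta=4, eps=2))
```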

Example (f=6, need 7 channels)

Definitions

Suppose the original sender s is processor p_0. Let c_h be the highest-numbered channel on which a correct processor p_h receives a hop-h message (T,s,σ,h) with h ≤ k, where k = [f/2].

Recursively define p_i, c_i, i = h-1, ..., 1 backward as follows: p_i is the processor that sends the message (T,s,σ,i+1) to p_{i+1}, and c_i is the highest-numbered channel on which p_i received the hop-i message (T,s,σ,i) from p_{i-1}.

p_1 is the processor that received a hop-1 message from the sender s = p_0, and it sends the message (T,s,σ,2) to p_2. The highest-numbered channel on which p_1 has received message (T,s,σ,1) is c_1.

Notice that c_i > c_j for all i > j.

Definitions (continued)

Let A_i be the set of components: { (1) p_{i-1}, (2) p_{i-1}'s out-adaptor to c_i + 1, (3) channel c_i + 1, (4) p_i's in-adaptor to c_i + 1 }. These four components are in red in the picture below.

Recursively let C_1 = A_1, C_{i+1} = C_i ∪ A_{i+1}.

Notice that C_i = A_1 ∪ A_2 ∪ ... ∪ A_i and the sets C_i and A_{i+1} are disjoint.

[Figure: processors P_0, P_1, ..., P_{i-1}, P_i labeled with the component sets A_1, A_2, ..., A_i.]

Theorem

Suppose by local time T_h = T+h(δ+ε), processor p_h accepts a message (T,s,σ,h). If c_h ≤ f+1-h, then there are at least h faults in the set C_h by time T_h. If c_h > f+1-h, then there are at least h-1 faults in C_h.

[Figure: processors P_0, P_1, ..., P_{h-1}, P_h labeled with the sets A_1, A_2, ..., A_h; C_h = A_1 ∪ A_2 ∪ ... ∪ A_h. A_h includes the sender of the hop-h message. P_h receives the hop-h message.]
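The theorem's conclusion can be read as a lower bound that depends only on c_h, f and h. The following sketch encodes just that case split; the function name is hypothetical and nothing here replaces the proof.

```python
def min_faults_in_chain(c_h, f, h):
    """Lower bound on faults in C_h when p_h accepts a hop-h message by T + h(delta + eps)."""
    if c_h <= f + 1 - h:
        return h        # highest channel is low: at least h faults
    return h - 1        # highest channel is above f+1-h: at least h-1 faults


if __name__ == "__main__":
    f = 3  # illustrative value
    print(min_faults_in_chain(c_h=4, f=f, h=1))  # received on channel f+1 at hop 1
    print(min_faults_in_chain(c_h=2, f=f, h=2))
```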

Induction Proof

● Theorem: Suppose by local time T_h = T+h(δ+ε), processor p_h accepts a message (T,s,σ,h). If c_h ≤ f+1-h, then there are at least h faults in the set C_h by time T_h. If c_h > f+1-h, then there are at least h-1 faults in C_h.

Base step h = 1

p_1 did not receive on channel f+1 if c_1 ≤ f+1-1 = f. Hence one of the components in C_1 must have failed. Otherwise nothing needs to have failed, and the claim of at least h-1 = 0 faults holds.

[Figure: processors P_0, P_1, ..., P_i, P_{i+1} labeled with the sets A_1, A_2, ..., A_{i+1}.]

Induction Proof (continued)

● Induction step: assume the claim holds for h = i

Suppose by time T_{i+1} = T+(i+1)(δ+ε), p_{i+1} received (T,s,σ,i+1) on channels no higher than c_{i+1}. By definition, p_i must have forwarded (T,s,σ,i+1) to p_{i+1} on c_i + 1, ..., f+1-i. The need for p_i to forward the message means c_i < f+1-i, and therefore C_i has at least i faults.

If c_{i+1} ≤ f+1-h = f+1-(i+1) = f-i, then p_{i+1} did not receive on channel f+1-i. This means either p_i has crashed, or there is a fault with the out-adaptor of p_i on channel c_{i+1} + 1, or the channel c_{i+1} + 1, or the in-adaptor of p_{i+1} on c_{i+1} + 1. Thus there is a fault in A_{i+1}. Since p_i forwards to p_{i+1} on channels > c_i, we have c_{i+1} > c_i, and the sets C_i and A_{i+1} are disjoint. Since C_{i+1} = C_i ∪ A_{i+1}, there are at least i+1 faults in C_{i+1}.

If c_{i+1} > f+1-h, we only need to show h-1 = i+1-1 = i faults in C_{i+1}, but this is already assured by the induction hypothesis, which assumes that C_i has at least i faults, and so must C_{i+1}.

The Unanimity Property

If a correct processor p (one that does not crash during the broadcast) accepts a broadcast by time T+Δ on p's clock, then each correct processor q accepts the broadcast by time T+Δ on q's clock.

[Figure: processors P_0, P_1, ..., P_i, P_{i+1} labeled with the sets A_1, A_2, ..., A_{i+1}; C_h = A_1 ∪ A_2 ∪ ... ∪ A_h.]

Unanimity Proof (continued)

Case c = f+1-h: A message has already been sent on channels 1, ..., f+1-h. There are at most f-h faults which can affect any correct processor not in C_h.

Case c > f+1-h: Then there are at least h-1 faults in C_h, i.e., there are at most f-(h-1) faults that can affect processors not in C_h. However, at least f+1-h+1 messages have been sent. One of them must reach a correct processor.

[Figure: processors P_0, P_1, ..., P_i, P_{i+1} labeled with the sets A_1, A_2, ..., A_{i+1}; C_h = A_1 ∪ A_2 ∪ ... ∪ A_h.]

The Unanimity Proof (continued)

Suppose by local time T_h = T+h(δ+ε), a correct processor p accepts a message (T,s,σ,h). Let c be the highest channel on which p receives a copy of the message.

● Case c < f+1-h: Then there are at least h faults in the set C_h by time T_h. Notice that the processors in C_h in the theorem have all received the broadcast message, i.e., the h faults do not involve any processor (or its in-adaptor) that has not received a copy by time T_h. Also, all the channels in C_h that may be faulty, {c_i + 1, i ≤ h}, have messages forwarded on them by the corresponding p_i's. Since p is correct, it sends messages on all of channels c+1 to f+1-h. Thus there is a message (not corrupted by any of the h failures in C_h) on channels 1, ..., c, c+1, ..., f+1-h for processors not in C_h to receive.

Since at most f-h faults can affect any processor not already in C_h, one of the f+1-h channels must reach any of these processors.

The f-fault tolerant protocol

task Start;
  const Δ = [f/2](δ+ε) + (δ+ε);
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ); T ← clock;
    for c = 1 to f+1 do send (T,myid,σ,1) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The f-fault tolerant protocol (cntd)

task Receive;
  const Δ = [f/2](δ+ε) + (δ+ε);
  var U,T: Time; σ: Update; s: Processor; h: Integer;
  cycle
    receive (T,s,σ,h) from c; U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if U ≥ T + h(δ+ε) then "too late to forward" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T))
    then "deja vu" C(T)(s) ← max{c,C(T)(s)};
    else H ← H ∪ (T,s,σ);
      if h ≤ [f/2] & c < f+1-h then
        C ← C ∪ (T,s,c);
        schedule Forward(T,s,h) at T+h(δ+ε);
      fi
      schedule Deliver(T) at T+Δ;
    fi
  endcycle

The f-fault tolerant protocol (continued)

task Forward(T: Time; s: Processor; h: Integer);
  if C(T)(s) < f+1-h
  then for i = C(T)(s)+1 to f+1-h do send (T,s,H(T)(s),h+1) on i
  fi;

"H(T)(p) = the update σ broadcast by p at time T."
"C(T)(s) = highest channel on which a message (T,s,*,*) was received."
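To make the Receive task's case analysis easier to follow, here is a hedged Python sketch of one receive step. It is not the course's implementation. The class name, the returned action lists, and the default of 0 for a missing C(T)(s) entry are assumptions made for illustration, and [f/2] is read as integer division. Scheduling is only reported as an action, not performed.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    f: int        # maximum number of faults tolerated
    delta: float  # end-to-end delay bound (the slides' δ)
    eps: float    # clock deviation bound (the slides' ε)
    H: dict = field(default_factory=dict)  # H[T][s] = update broadcast by s at time T
    C: dict = field(default_factory=dict)  # C[T][s] = highest channel seen for (T, s)

    @property
    def hop(self):
        return self.delta + self.eps

    @property
    def termination(self):
        # Assumption: [f/2] is the integer part of f/2.
        return (self.f // 2) * self.hop + self.hop

    def on_receive(self, T, s, sigma, h, c, U):
        """Process message (T, s, sigma, h) arriving on channel c at local time U."""
        if U >= T + self.termination:
            return ["late message"]
        if U >= T + h * self.hop:
            return ["too late to forward"]
        if T in self.H and s in self.H[T]:
            seen = self.C.setdefault(T, {})
            seen[s] = max(c, seen.get(s, 0))  # default 0 is an assumption
            return ["deja vu"]
        self.H.setdefault(T, {})[s] = sigma
        actions = []
        if h <= self.f // 2 and c < self.f + 1 - h:
            self.C.setdefault(T, {})[s] = c
            actions.append(("Forward", T + h * self.hop))
        actions.append(("Deliver", T + self.termination))
        return actions


if __name__ == "__main__":
    r = Receiver(f=2, delta=4, eps=2)  # illustrative values
    print(r.on_receive(T=0, s="s", sigma="x", h=1, c=1, U=3))
    print(r.on_receive(T=0, s="s", sigma="x", h=1, c=2, U=4))
    print(r.on_receive(T=0, s="s", sigma="x", h=1, c=3, U=12))
```

Returning actions instead of scheduling them keeps the sketch free of any timer or executive, which the slides leave to the real-time executive.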

Unanimity Compromise

Unanimity may be violated if correct processors do not wait long enough for messages to arrive on all channels before forwarding. For example, if processors wait till T+hδ+ε for a hop-h message time-stamped T, instead of waiting till T+h(δ+ε), then unanimity may be compromised. Consider an example where f = 2, ε = 2, δ = 4.

Unanimity Compromise

Let Clock1(0) = 0, Clock2(0) = 1, Clock3(0) = 0, Clock4(0) = 2. Unanimity can be violated by a processor performance failure as shown above. Here f = 2, ε = 2, δ = 4 and processors wait till T+hδ+ε for a hop-h message time-stamped T, instead of waiting till T+h(δ+ε).

Unanimity Compromise (continued)

[Figure: clock readings of S, E and L. At time 0: C_S(0) = 0, C_E(0) = 0, C_L(0) = 0. At time 11: C_E(11) = 13, C_L(11) = 11. Actual transmissions take 11 instead of δ time units.]

S takes 11 time units to send the message to E and L, whereas δ = 10, ε = 2, i.e., the message should arrive at E and L by their local time 12 in the worst case. E will reject but L will accept the message.