Synchronization Algorithms
and Concurrent Programming
Gadi Taubenfeld
Chapter 5
Barrier Synchronization
Version: June 2014
This presentation is a modified version of a presentation that
Itai Avrian and Shachar Gidron prepared for my Seminar in
Concurrent and Distributed Computing, 2012.
Gadi Taubenfeld © 2014
Chapter 5
Synchronization Algorithms
and Concurrent Programming
ISBN: 0131972596, 1st edition
A note on the use of these ppt slides:
I am making these slides freely available to all (faculty, students, readers).
They are in PowerPoint form so you can add, modify, and delete slides and slide
content to suit your needs. They obviously represent a lot of work on my part.
In return for use, I only ask the following:
That you mention their source, after all, I would like people to use my book!
That you note that they are adapted from (or perhaps identical to)
my slides, and note my copyright of this material.
Thanks and enjoy!
Gadi Taubenfeld
All material copyright 2014
Gadi Taubenfeld, All Rights Reserved
To get the most updated version of these slides go to:
http://www.faculty.idc.ac.il/gadi/book.htm
Chapter 5
Synchronization Algorithms and Concurrent Programming
Gadi Taubenfeld © 2014
2
Chapter 5
Barrier Synchronization
5.1 Barriers
5.2 Atomic Counter
5.3 Test-and-set Bits
5.4 Combining Tree Barrier*
5.5 A Tree-based Barrier
5.6 The Dissemination Barrier*
5.7 The See-Saw Barrier
5.8 Semaphores
5.9 Bibliographic Notes*
5.10 Problems*
* Not covered in this presentation
Definition and Motivation
Barrier Synchronization
What is a Barrier?
[Figure: three snapshots over time of processes P1, P2, P3 and P4. First, the four processes approach the barrier; then all except P4 arrive; once all arrive, they continue.]
What is a Barrier ?
A barrier is a coordination mechanism (an algorithm),
that forces processes which participate in a concurrent
(or distributed) algorithm to wait until each one of
them has reached a certain point in its program.
The collection of these coordination points is called the
barrier.
Once all the processes have reached the barrier, they
are all permitted to continue past the barrier.
Example: Parallel Prefix Sum
[Figure: the array at the beginning and at the end of the computation; time flows downward.]
begin: a, b, c, d, e, f
end: a, a+b, a+b+c, a+b+c+d, a+b+c+d+e, a+b+c+d+e+f
Example: Parallel Prefix Sum
[Figure: one new prefix is computed per step; time flows downward.]
begin: a, b, c, d, e, f
a, a+b, c, d, e, f
a, a+b, a+b+c, d, e, f
a, a+b, a+b+c, a+b+c+d, e, f
a, a+b, a+b+c, a+b+c+d, a+b+c+d+e, f
end: a, a+b, a+b+c, a+b+c+d, a+b+c+d+e, a+b+c+d+e+f
Example: Parallel Prefix Sum
[Figure: all elements are updated in parallel at each step; time flows downward.]
begin: a, b, c, d, e, f
a, a+b, b+c, c+d, d+e, e+f
a, a+b, a+b+c, a+b+c+d, b+c+d+e, c+d+e+f
end: a, a+b, a+b+c, a+b+c+d, a+b+c+d+e, a+b+c+d+e+f
Example: Parallel Prefix Sum
[Figure: the parallel computation with a barrier between consecutive steps; time flows downward.]
begin: a, b, c, d, e, f
a, a+b, b+c, c+d, d+e, e+f
barrier
a, a+b, a+b+c, a+b+c+d, b+c+d+e, c+d+e+f
barrier
end: a, a+b, a+b+c, a+b+c+d, a+b+c+d+e, a+b+c+d+e+f
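A minimal Python sketch of the parallel prefix sum with barriers, assuming one thread per array element and the standard library's threading.Barrier standing in for the barriers studied in this chapter. The name parallel_prefix_sum and the doubling offset d are illustrative choices, not taken from the slides. Each round has a read phase and a write phase with a barrier after each, so no thread overwrites a cell that another thread still needs to read.

```python
import threading


def parallel_prefix_sum(values):
    """Inclusive prefix sums, one thread per element, synchronized by barriers."""
    n = len(values)
    if n == 0:
        return []
    x = list(values)                # shared array, updated in place
    barrier = threading.Barrier(n)  # library barrier for n threads

    def worker(i):
        d = 1
        while d < n:
            # Read phase: fetch the value d cells to the left (0 if none).
            left = x[i - d] if i >= d else 0
            barrier.wait()          # everyone has read before anyone writes
            # Write phase: each thread updates only its own cell.
            x[i] += left
            barrier.wait()          # everyone has written before the next round
            d *= 2

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x
```

For a six-element input such as the one on the slides, the rounds use the offsets 1, 2 and 4.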
Example: Video
Single thread
Assume we have a video application
Each frame needs to be calculated,
before being displayed
Prepare frame for display by graphics processor
while (true)
{
frame = prepare_next_frame();
frame.display();
}
Example: Video
Multiple threads
Now, we have n threads running in parallel
It makes sense to split the frame into n disjoint parts
Each thread prepares its own parts in parallel with others
Each thread may run on a different graphics processor
Barrier globalBarrier;
i = getThreadID();
while (true)
{
frame[ i ].prepare();
globalBarrier.await();
frame[ i ].display();
}
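The multi-threaded loop above can be sketched in Python as follows, assuming threading.Barrier plays the role of globalBarrier. The function render_frames, the prepared table and the complete_at_display log are hypothetical stand-ins for real frame preparation and display; the log records whether every part of a frame was ready at the moment a part was displayed.

```python
import threading


def render_frames(n_threads, n_frames):
    """Each thread prepares its part of a frame; parts are shown only when all are ready."""
    barrier = threading.Barrier(n_threads)
    prepared = [[False] * n_threads for _ in range(n_frames)]
    complete_at_display = []  # one entry per simulated display call

    def worker(i):
        for f in range(n_frames):
            prepared[f][i] = True   # stands in for frame[i].prepare()
            barrier.wait()          # stands in for globalBarrier.await()
            # Stands in for frame[i].display(): all parts must be ready by now.
            complete_at_display.append(all(prepared[f]))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return complete_at_display
```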
Where it is needed
Scientific & numeric computation
Computer graphics
Garbage collection
Parallel computing in general
Various Barrier Goals
Ideally when designing barriers, we would like to have the
following properties:
Low shared memory space complexity
Low contention on shared objects
Few shared memory references per process
No need for shared memory initialization
Symmetry (same amount of work for all processes)
Algorithm simplicity
Simple basic primitive
Minimal propagation time
Reusability of the barrier (must!)
Data Objects in Use
Atomic Bit
Atomic Register
Fetch-and-increment register
Test-and-set bits
Read-Modify-Write register
Semaphores
Barriers using atomic counters
Section 5.2
Atomic Bit
Atomic Register
Fetch-and-increment register / atomic counter
Fetch-and-increment Register
A shared register that supports a F&I operation:
Input: register r
Atomic operation:
r is incremented by 1
the old value of r is returned
function fetch-and-increment (r : register)
orig_r := r;
r:= r + 1;
return (orig_r);
end-function
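A minimal Python sketch of such a register, assuming a lock is an acceptable way to emulate the atomicity that the hardware primitive provides. The class name FetchAndIncrement and its read method are illustrative, not part of the slides.

```python
import threading


class FetchAndIncrement:
    """Shared register with an atomic fetch-and-increment, emulated with a lock."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def fetch_and_increment(self):
        # The lock makes the read and the increment one indivisible step.
        with self._lock:
            old = self._value
            self._value = old + 1
            return old

    def read(self):
        with self._lock:
            return self._value
```

Because the operation is atomic, concurrent callers always receive distinct values.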
await macro
For clarity, we use the await macro
Not an operation of an object
This is also called: “spinning”
macro await (condition : boolean condition)
repeat
cond = eval(condition);
until (cond)
end-macro
Simple Barrier Using an Atomic Counter
Program of a Process
shared
counter: fetch and increment reg. – {0,..n}, initially = 0
go: atomic bit, initial value is immaterial
local
local.go: a bit, initial value is immaterial
local.counter: register
local.go := go
local.counter := fetch-and-increment (counter)
if local.counter + 1 = n then
  counter := 0
  go := 1 - go
else await(local.go ≠ go) fi
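The program above can be sketched in Python as follows. This is an illustration under stated assumptions: the class name CounterBarrier is hypothetical, a lock emulates the atomic fetch-and-increment, and time.sleep(0) is added inside the spin loop only so that a spinning Python thread yields the interpreter.

```python
import threading
import time


class CounterBarrier:
    """Reusable barrier built from a fetch-and-increment counter and a go bit."""

    def __init__(self, n):
        self.n = n
        self.counter = 0                # shared counter, initially 0
        self.go = 0                     # shared bit; its initial value is immaterial
        self._lock = threading.Lock()   # emulates the atomic fetch-and-increment

    def _fetch_and_increment(self):
        with self._lock:
            old = self.counter
            self.counter = old + 1
            return old

    def wait(self):
        local_go = self.go                          # remember the current go value
        local_counter = self._fetch_and_increment()
        if local_counter + 1 == self.n:
            # Last to arrive: reset the counter first, then release everyone.
            self.counter = 0
            self.go = 1 - self.go
        else:
            while local_go == self.go:              # spin until go is flipped
                time.sleep(0)
```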
Simple Barrier Using an Atomic Counter
Run for n=2 Processes
[Figure: initial state of a run of the program above. Shared memory (SM) holds counter and go; P1 and P2 each hold local.go and local.counter, whose values are still unknown (?).]
Simple Barrier Using an Atomic Counter
Run for n=2 Processes
[Figure: P1 and P2 both read go = 0. P1's fetch-and-increment returns 0; since 0+1≠2, P1 busy-waits. P2's fetch-and-increment returns 1; since 1+1=2, P2 sets counter to 0 and go to 1.]
Simple Barrier Using an Atomic Counter
Another Run for n=2 Processes
[Figure: P1 and P2 both read go = 0. P1: 0+1≠2, so P1 busy-waits. P2: 1+1=2, so P2 sets counter to 0 and flips go. Note on the figure: counter is a “fetch-and-increment” register.]
Another Algorithm Using an Atomic Counter
Program of a Process
shared
counter: fetch and increment reg. – {0,..n}, initially = 0
local
local.counter: register
local.counter := fetch-and-increment(counter)
if local.counter + 1 = n then
  counter := 0
else await(counter = 0) fi
Is this implementation incorrect?
Simple Barrier Using an Atomic Counter
There is high memory contention on the go bit
Reducing the contention:
Replace the go bit with n bits: go[1],…,go[n]
Process pi may spin only on the bit go[i]
A Local Spinning Counter Barrier
Program of a Process i
shared
counter: fetch and increment reg. – {0,..n}, initially = 0
go[1..n]: array of atomic bits, initial values are immaterial
local
local.go: a bit, initial value is immaterial
local.counter: register
local.go := go[i]
local.counter := fetch-and-increment (counter)
if local.counter + 1 = n then
  counter := 0
  for j=1 to n do go[j] := 1 – go[j] od
else await(local.go ≠ go[i]) fi
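A Python sketch of the local spinning variant, under the same assumptions as before: a lock emulates fetch-and-increment, the class name LocalSpinBarrier is hypothetical, and time.sleep(0) only lets a spinning thread yield. Processes are indexed 0..n-1 here rather than 1..n.

```python
import threading
import time


class LocalSpinBarrier:
    """Counter barrier in which process i spins only on its own bit go[i]."""

    def __init__(self, n):
        self.n = n
        self.counter = 0
        self.go = [0] * n               # one go bit per process
        self._lock = threading.Lock()   # emulates the atomic fetch-and-increment

    def _fetch_and_increment(self):
        with self._lock:
            old = self.counter
            self.counter = old + 1
            return old

    def wait(self, i):
        local_go = self.go[i]
        local_counter = self._fetch_and_increment()
        if local_counter + 1 == self.n:
            self.counter = 0
            # The last process flips all n bits, one after the other.
            for j in range(self.n):
                self.go[j] = 1 - self.go[j]
        else:
            while local_go == self.go[i]:   # spin on the private bit only
                time.sleep(0)
```

The loop over all n bits is where the O(n) release time listed in the comparison comes from.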
A Local Spinning Counter Barrier
Example Run for n=3 Processes
[Figure: P1's fetch-and-increment returns 0 (0+1≠3) and P2's returns 1 (1+1≠3), so P1 and P2 busy-wait on their own go bits. P3's returns 2 (2+1=3), so P3 sets counter to 0 and flips every go[j].]
Comparison of fetch-and-increment Barriers
Simple Barrier
Pros:
Very Simple
Shared memory: O(log n) bits
Takes O(1) until last waiting p is awakened
Cons:
High contention on the go bit
Contention on the counter register (*)
Simple Barrier with go array
Pros:
Low contention on the go array
In some models: spinning is done on local memory
remote mem. ref.: O(1)
Cons:
Shared memory: O(n)
Still contention on the counter register (*)
Takes O(n) until last waiting p is awakened
(*) One technique for solving this contention is the
Combining Tree Barriers – page 210
A Barrier without Memory Initialization
Barrier is a basic synchronization method
To initialize shared memory, processes need to be
synchronized
Thus, barrier may be a prerequisite for shared memory
initialization and cannot assume one
Processes may not be implemented in the same way
So it is desirable to reduce the dependency between them
A Barrier without Memory Initialization
Program of a Process
shared
counter: atomic counter – {0,..n-1}, initial value is immaterial
go: atomic bit, initial value is immaterial
local
local.go: a bit, initial value is immaterial
local.counter: register, initial value is immaterial
local.go := go // remember current value
local.counter := counter // remember current value
counter := counter + 1 (mod n) // atomic increment mod n
repeat
  if counter = local.counter // all processes have arrived
  then go := 1 – local.go fi // notify all
until (local.go ≠ go)
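A Python sketch of this program, assuming a lock emulates the atomic increment mod n. The class name UninitializedCounterBarrier and the constructor parameters counter and go are hypothetical; they exist only so the barrier can be started from arbitrary initial values.

```python
import threading
import time


class UninitializedCounterBarrier:
    """Counter barrier that works for any initial values of counter and go."""

    def __init__(self, n, counter=0, go=0):
        self.n = n
        self.counter = counter % n      # any value in {0, ..., n-1}
        self.go = go                    # any bit value
        self._lock = threading.Lock()   # emulates the atomic increment mod n

    def wait(self):
        local_go = self.go              # remember current value
        local_counter = self.counter    # remember current value
        with self._lock:
            self.counter = (self.counter + 1) % self.n
        while True:
            # After n increments the counter is back at the remembered value.
            if self.counter == local_counter:
                self.go = 1 - local_go  # notify all
            if local_go != self.go:
                break
            time.sleep(0)               # let other threads run while spinning
```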
Using Test-and-Set Bits
Section 5.3
Test-and-Set Bit
Input: bit b
Test-and-set is an atomic operation:
b is set to 1
the old value of b (i.e., 0 or 1) is returned
function test-and-set (b : bit)
orig_b := b;
b:= 1;
return (orig_b);
end-function
An atomic reset operation, which sets the value to 0, is supported
Test-and-Test-and-Set Bit
Operations supported:
Test-and-set and Reset, like a test-and-set bit
Atomic read (test)
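Both objects can be sketched in Python with one class, assuming a lock is used to emulate atomicity. The class name TestAndSetBit and the method name read are illustrative; read corresponds to the extra test operation of a test-and-test-and-set bit.

```python
import threading


class TestAndSetBit:
    """Bit with atomic test-and-set, reset and read, emulated with a lock."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def test_and_set(self):
        # Set the bit to 1 and return the value it had before.
        with self._lock:
            old = self._value
            self._value = 1
            return old

    def reset(self):
        with self._lock:
            self._value = 0

    def read(self):
        # The "test" of a test-and-test-and-set bit: a plain atomic read.
        with self._lock:
            return self._value
```

When several threads call test_and_set on a bit that holds 0, exactly one of them gets 0 back.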
Test-and-set based Barrier
shared
leader: test-and-set bit, initial value = 0
countflag: test-and-test-and-set bit, initial value = 0
go: atomic bit, initial value is immaterial
local
local.go: a bit, initial value is immaterial
local.counter: register, initial value is immaterial
Test-and-set based Barrier
local.go := go
if test-and-set(leader) = 0 then // the leader
  local.counter := 0
  repeat
    await(countflag = 1) // a test operation
    local.counter := local.counter + 1
    reset(countflag)
  until (local.counter = n - 1)
  reset(leader)
  go := 1 – go
else // the other processes
  await(test-and-set(countflag) = 0)
  await(local.go ≠ go)
fi
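A Python sketch of this barrier, assuming n ≥ 2 and a lock-based emulation of the test-and-set bits. The names TASBit and TestAndSetBarrier are hypothetical, and time.sleep(0) is added in the spin loops only so that a spinning thread yields.

```python
import threading
import time


class TASBit:
    """Test-and-test-and-set bit emulated with a lock."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old, self.value = self.value, 1
            return old

    def reset(self):
        with self._lock:
            self.value = 0


class TestAndSetBarrier:
    def __init__(self, n):
        self.n = n
        self.leader = TASBit(0)
        self.countflag = TASBit(0)
        self.go = 0                     # initial value is immaterial

    def wait(self):
        local_go = self.go
        if self.leader.test_and_set() == 0:
            # The first process to set the leader bit counts the other n - 1.
            local_counter = 0
            while local_counter < self.n - 1:
                while self.countflag.value != 1:    # a test operation
                    time.sleep(0)
                local_counter += 1
                self.countflag.reset()              # let the next process signal
            self.leader.reset()
            self.go = 1 - self.go
        else:
            # Signal arrival once, then wait to be released.
            while self.countflag.test_and_set() != 0:
                time.sleep(0)
            while local_go == self.go:
                time.sleep(0)
```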
Test-and-Set Barrier
[Figure: P1, P2, P3 and P4 each apply the atomic test-and-set operation to the leader bit in shared memory (SM), which changes from 0 to 1. The first to set the leader bit is the leader.]
Test-and-Set Barrier
[Figure: P4 is the leader. Each of P1, P2 and P3 awaits a successful atomic test-and-set operation on countflag in shared memory (SM), and then awaits a change of go. The leader repeats await(countflag = 1) until local.counter = n - 1, with local.counter going 0, 1, 2, 3. Once all processes have arrived, the leader changes the go bit and exits the barrier.]
A Barrier without Memory Initialization
Two new techniques
1. The leader counts each process twice
Needs only to count to 2n – 2
Allows off-by-one mistakes
Thus makes memory initialization redundant
2. Asymmetry
Process has a role according to its index i
Pros:
saves bits and operations
Cons:
different processes differ in their tasks
Asymmetric Test-and-set based Barrier w/o M/I
program of process i
shared
countflag: test-and-test-and-set bit, initial value is immaterial
go: atomic bit, initial value is immaterial
local
local.go: a bit, initial value is immaterial
local.counter: atomic register, initial value is immaterial
No need for the leader test-and-set bit
Asymmetric Test-and-set based Barrier w/o M/I
program of process i
local.go := go
if i = 1 then // the leader
  local.counter := 0
  repeat
    await(countflag = 1) // a test operation
    local.counter := local.counter + 1
    reset(countflag)
  until (local.counter = 2n - 2)
  go := 1 – go
else // the other processes
  await(test-and-set(countflag) = 0)
  await(test-and-set(countflag) = 0)
  await(local.go ≠ go)
fi
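A Python sketch of the asymmetric version, assuming n ≥ 2 and processes indexed 1..n with process 1 as the fixed leader. The names TASBit and AsymmetricTASBarrier and the constructor parameters countflag and go are hypothetical; the parameters exist only so that arbitrary initial values can be tried.

```python
import threading
import time


class TASBit:
    """Test-and-test-and-set bit emulated with a lock."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old, self.value = self.value, 1
            return old

    def reset(self):
        with self._lock:
            self.value = 0


class AsymmetricTASBarrier:
    def __init__(self, n, countflag=0, go=0):
        self.n = n
        self.countflag = TASBit(countflag)  # initial value is immaterial
        self.go = go                        # initial value is immaterial

    def _signal(self):
        while self.countflag.test_and_set() != 0:
            time.sleep(0)

    def wait(self, i):
        local_go = self.go
        if i == 1:
            # The leader counts 2n - 2 signals; a stale 1 in countflag can
            # account for at most one of them, so everyone has signaled once.
            local_counter = 0
            while local_counter < 2 * self.n - 2:
                while self.countflag.value != 1:    # a test operation
                    time.sleep(0)
                local_counter += 1
                self.countflag.reset()
            self.go = 1 - self.go
        else:
            self._signal()                  # every other process signals twice
            self._signal()
            while local_go == self.go:
                time.sleep(0)
```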
Test-and-Set based Barriers
Properties
Different object (T&S instead of F&I)
Pros:
Shared memory: Only bits - O(1) space
As opposed to the counter-based barriers, which require O(log n) bits
Does not require memory initialization (in the second version)
Cons:
Asymmetric (in the second version)
Still high contention on countflag & go bits
Tree Based Barriers
Section 5.5
A Tree-based Barrier
The processes are organized in a binary tree
Each node is owned by a predetermined process
Each process waits until its 2 children arrive, combines the
results and passes them on to its parent
When the root learns that its 2 children have arrived, it tells
its children that they can move on
The signal propagates down the tree until all the processes
get the message
[Figure: binary tree of processes 1 to 7. Node 1 is the root, with children 2 and 3; node 2 has children 4 and 5; node 3 has children 6 and 7.]
A Tree-based Barrier
Assume n = 2^k − 1
The children of node i are nodes 2i and 2i+1
[Figure: complete binary tree with nodes 1 to 15, together with the shared arrays arrive and go, each indexed 2 to 15.]
A Tree-based Barrier
program of process i
shared
arrive[2..n]: array of atomic bits, initial values = 0
go[2..n]: array of atomic bits, initial values = 0
if i=1 then // root
  await(arrive[2] = 1); arrive[2] := 0
  await(arrive[3] = 1); arrive[3] := 0
  go[2] := 1; go[3] := 1
else if i ≤ (n-1)/2 then // internal node
  await(arrive[2i] = 1); arrive[2i] := 0
  await(arrive[2i+1] = 1); arrive[2i+1] := 0
  arrive[i] := 1
  await(go[i] = 1); go[i] := 0
  go[2i] := 1; go[2i+1] := 1
else // leaf
  arrive[i] := 1
  await(go[i] = 1); go[i] := 0 fi
fi
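A Python sketch of the tree-based barrier, assuming n = 2^k − 1 with n ≥ 3 and processes numbered 1..n as in the program above. The class name TreeBarrier and the helper _await_and_clear are hypothetical; plain list cells stand in for the atomic bits, and time.sleep(0) only lets a spinning thread yield.

```python
import time


class TreeBarrier:
    def __init__(self, n):
        if n < 3 or (n + 1) & n:
            raise ValueError("n must be 2^k - 1 with n >= 3")
        self.n = n
        # Index 0 and 1 are unused so that indices match arrive[2..n], go[2..n].
        self.arrive = [0] * (n + 1)
        self.go = [0] * (n + 1)

    @staticmethod
    def _await_and_clear(bits, j):
        while bits[j] != 1:
            time.sleep(0)
        bits[j] = 0

    def wait(self, i):
        if i == 1:                                      # root
            self._await_and_clear(self.arrive, 2)
            self._await_and_clear(self.arrive, 3)
            self.go[2] = 1
            self.go[3] = 1
        elif i <= (self.n - 1) // 2:                    # internal node
            self._await_and_clear(self.arrive, 2 * i)
            self._await_and_clear(self.arrive, 2 * i + 1)
            self.arrive[i] = 1                          # report to the parent
            self._await_and_clear(self.go, i)
            self.go[2 * i] = 1                          # release both children
            self.go[2 * i + 1] = 1
        else:                                           # leaf
            self.arrive[i] = 1
            self._await_and_clear(self.go, i)
```

Each bit is written by one process and awaited by one other, which is the source of the low contention noted in the pros.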
A Tree-based Barrier
Example Run for n=7 Processes
[Figure: animation of one episode. The leaves p4 to p7 set their arrive bits and wait for go[4] to go[7]. p2 and p3 wait for their children to arrive, zero arrive[4,5] and arrive[6,7], set arrive[2] and arrive[3], and wait for go[2] and go[3]. The root p1 waits for p2 and p3, zeros arrive[2] and arrive[3], and sets the go bits of its children; the go bits then propagate down the tree until every process has finished.]
A Tree-based Barrier
Pros:
Low shared memory contention
No bit is shared by more than 2 processes
Good for larger n
Fast (in comparison to local spinning)
– information from the root propagates after log(n) steps
Uses only atomic bits (no special objects)
On some models:
each process spins on a locally accessible bit
# (remote memory ref.) = O(1) per process
Cons:
Shared memory space complexity – O(n)
Asymmetric – not all the processes do the same amount
of work (*)
(*) There is a similar barrier which is symmetric, but at the cost of more
shared memory consumption -- O(nlogn) as opposed to O(n) .
See the Dissemination Barrier from Section 5.6 page 213.
The See-Saw Barrier
Section 5.7
See-Saw Barrier
Now, we’ll use a Read-Modify-Write object
This allows us to construct a symmetric barrier that requires only
a few shared bits
This algorithm can also be used to solve the leader election
and the consensus problems
The See-Saw barrier is based on a solution to the wake-up
problem which was proposed by M. J. Fischer, S. Moran, S.
Rudich, and G. Taubenfeld (1996)
Read-Modify-Write Register
Input: register r with n bits, function f(r)
Atomic operation:
Reads the register
Calls function f on r, return value is written into r
The old value of r is returned
function read-modify-write (r : register, f : function)
orig_r := r;
r := f(r);
return (orig_r);
end-function
Usually f is custom made for the algorithm
Data Flow
Tokens:
Each process starts with 2 tokens
Total number of tokens doesn’t change
Each process can absorb one token or emit one token,
at a time
See-Saw:
One see-saw
Can be left-up-right-down OR left-down-right-up
Each process that enters the playground needs to get up on the see-saw
Each process which is on the see-saw is either on the
left side or the right side
Tokens are weightless
Data Representation
Using 2-bit read-modify-write register
Token Bit
Two states:
1. no-token-present
2. token-present
See-saw Bit
Two states:
1. left-side-down
2. right-side-down
[Figure: P1 and P2 on the see-saw, each labeled with its token count T.]
Process State
[Figure: the four possible states of a process: never been on, on left side, on right side, and got-off. Processes P1 to P7 are placed according to their state, each labeled with its token count T.]
Runtime Flow
Each process loops until it gets off the see-saw
After it gets off, it waits for the go flag
The algorithm is based on 5 rules
On each loop iteration:
According to its state, one rule is performed
Only one process at a time performs a rule
A rule is done atomically, using the RMW register
Each rule changes the tokens and/or the state of the see-saw
There can be many processes on each side (up to n/2)
When one of the processes gets 2n tokens, it gets off and
sets the go flag
Rule #1 – Start of Algorithm
Applicable if:
scheduled process is “never-been-on”
Operation:
Saves the go bit locally
Gets on the up side, and swings the see-saw
[Figure: RMW register with token-state = no-token-present. The see-saw swings between right-side-down and left-side-down as P1 (T: 2) and P2 (T: 2) get on.]
Rule #2 – Emitter
Applicable if:
scheduled process is “down-side”, has tokens,
and token-state = no-token-present
Operation:
Deposits one token in the shared token-state
If it remains without tokens, gets off the see-saw and swings it
[Figure: RMW register with see-saw-state = left-side-down. P2 emits a token: its count drops from 2 to 1 and token-state changes from no-token-present to token-present. P1 holds 2 tokens.]
Rule #3 – Absorber
Applicable if:
scheduled process is “up-side”, and
token-state = token-present
Operation:
Takes the token from token-state
[Figure: RMW register with see-saw-state = left-side-down. P1 absorbs the token: its count rises from 2 to 3 and token-state changes from token-present to no-token-present. P2 holds 1 token.]
Rule #2 – Emitter
Applicable if:
scheduled process is “down-side”, has tokens,
and token-state = no-token-present
Operation:
Deposits one token in the shared token-state
If it remains without tokens, gets off the see-saw and swings it
The process that got off now awaits the go flag
[Figure: P2 emits its last token: its count drops from 1 to 0 and token-state changes from no-token-present to token-present. P2 gets off and the see-saw swings. P1 holds 3 tokens.]
Rule #4 – Leader
Applicable if:
scheduled process is on the see-saw, and sees at
least 2n tokens
Operation:
Gets off the see-saw, and flips the shared go bit
[Figure: RMW register with see-saw-state = right-side-down; token-state is shown as no-token-present and token-present. P1 holds 3 tokens; P2, with 0 tokens, is asleep (ZZZ…).]
Rule #4 – Leader
Applicable if:
scheduled process is on the see-saw, and sees at
least 2n tokens
Operation:
Gets off the see-saw, and flips the shared go bit
[Figure: RMW register with token-state = no-token-present and see-saw-state = right-side-down. P1 now holds 4 tokens and signals go!; P2, with 0 tokens, is asleep (ZZZ…).]
Rule #5 – End of the Algorithm
Applicable if:
scheduled process notices that the go bit has been
flipped (relative to its local.go)
Operation:
Everybody has arrived; continue past the barrier
[Figure: RMW register with token-state = no-token-present and see-saw-state = right-side-down. P1 (T: 4) has signaled go!; P2 (T: 0) is shown asleep (ZZZ…).]
Important Invariants
Token Invariant
During a single episode of the see-saw barrier,
the number of tokens in the system
is either 2n or 2n+1 and never changes
(like in the test-and-set barrier)
Balance Invariant
During a single episode of the see-saw barrier,
the number of processes on the left and on the
right side of the see-saw is
either perfectly balanced
or favors the down side by 1
Correctness
When all processes are on the see-saw:
Tokens are given from the down side, until one gets-off
By induction, at some point:
one process will see 2n tokens
So no deadlock.
2n tokens can only be accumulated if all processes have
arrived, so this is a barrier.
Remarks
All the logic is done inside the atomic Modify function
of the RMW register
Needs to read and modify all the three bits atomically,
to prevent race-conditions
Before a process applies a rule, it first checks
whether the go bit has been flipped relative to its
local.go (regardless of its current state) !!!
Question
How many times does the state of the shared memory change
during one episode of the see-saw barrier?
O(n) in the best case
O(n^2) in the worst case
The See-Saw Barrier
Pros:
O(1) shared memory space complexity
No need to initialize shared memory
Symmetric
Cons:
Uses custom Read-Modify-Write register
High memory contention on the RMW bits
Worst case O(n^2) total shared memory
references
Complex
The code:
A Barrier using Semaphores
Section 5.8
Semaphore
Shared object
Takes a non-negative integer value
Supports two operations:
Down
If value > 0, the value is decremented by 1
Otherwise, the process is blocked until the value
becomes > 0
Up – the value is incremented by 1
Incrementing, Decrementing and testing the
semaphore are executed atomically
Binary Semaphore
Semaphore whose value is only 0 or 1
Decrementing is identical to general semaphore
Incrementing is equal to setting the value to 1
Initial value is assumed to be 1
Can be used to implement a deadlock-free
mutual exclusion:
down(S)
critical-section
up(S)
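The down/critical-section/up pattern above can be sketched in Python, assuming threading.Semaphore(1) stands in for the binary semaphore S, with acquire playing the role of down and release the role of up. The function count_with_mutex and its shared total are hypothetical and serve only to give the critical section something to protect.

```python
import threading


def count_with_mutex(n_threads, increments):
    """Increment a shared total under mutual exclusion and return it."""
    s = threading.Semaphore(1)  # initial value 1, as assumed for S
    total = 0

    def worker():
        nonlocal total
        for _ in range(increments):
            s.acquire()         # down(S)
            total += 1          # critical section
            s.release()         # up(S)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Since every up follows a matching down, the semaphore value never leaves {0, 1} in this sketch.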
Barrier using Semaphores
Algorithm for n processes
shared
arrival: binary semaphore, initially 1
departure: binary semaphore, initially 0
counter: atomic register ranges over {0, …, n}, initially 0
down(arrival)
counter := counter + 1 // counter is an atomic register
if counter < n then up(arrival) else up(departure) fi
down(departure)
counter := counter - 1
if counter > 0 then up(departure) else up(arrival) fi
Question:
Would this barrier be correct if the
shared counter were not an atomic register?
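A Python sketch of the semaphore-based barrier, assuming threading.Semaphore objects stand in for the two binary semaphores (acquire for down, release for up). The class name SemaphoreBarrier is hypothetical. Threads block inside acquire, so no spinning is needed.

```python
import threading


class SemaphoreBarrier:
    """Reusable barrier built from two semaphores and a counter."""

    def __init__(self, n):
        self.n = n
        self.arrival = threading.Semaphore(1)    # initially 1
        self.departure = threading.Semaphore(0)  # initially 0
        self.counter = 0

    def wait(self):
        # Arrival phase: processes pass through arrival one at a time.
        self.arrival.acquire()
        self.counter += 1
        if self.counter < self.n:
            self.arrival.release()      # let the next process arrive
        else:
            self.departure.release()    # last arrival opens the departure phase
        # Departure phase: processes leave one at a time.
        self.departure.acquire()
        self.counter -= 1
        if self.counter > 0:
            self.departure.release()    # let the next process leave
        else:
            self.arrival.release()      # last to leave reopens arrival
```

The one-at-a-time hand-off in each phase matches the O(n) propagation delay listed among the cons.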
Barrier using Semaphores
Properties
Pros:
Very Simple
Space complexity O(1)
Symmetric
Cons:
Requires a strong object
Requires some central manager
High contention on the semaphores if no central
manager
Propagation delay O(n)
Summary
Barrier Synchronization
Barriers we’ve seen
Simple barrier
Based on atomic fetch-and-increment counter
Local spinning barrier
Based on atomic fetch-and-increment counter
and go array
Test-and-Set barriers
Based on test-and-test-and-set objects
One version without memory initialization
Tree-based barrier
See-Saw barrier
Semaphore-based barrier
Conclusions
Many possible algorithms for Barrier Synchronization
Each has pros/cons
Different shared objects allow various algorithms
Choosing the correct barrier is application/platform
dependent (need to do benchmarking to know for sure).