OMSE 510: Computing Foundations 6: Multithreading


OMSE 510: Computing
Foundations
6: Multithreading
Chris Gilmore <[email protected]>
Portland State University/OMSE
Material Borrowed from Jon Walpole’s lectures
1
Today
Threads
Critical Sections
Mutual Exclusion
2
Threads
Processes have the following components:
an address space
 a collection of operating system state
 a CPU context … or thread of control

On multiprocessor systems, with several CPUs,
it would make sense for a process to have
several CPU contexts (threads of control)
Multiple threads of control could run in the
same address space on a single CPU system
too!

“thread of control” and “address space” are
orthogonal concepts
3
Threads
Threads share a process address space with
zero or more other threads
Threads have their own
PC, SP, register state etc
 stack

What other OS state should be private to
threads?
A traditional process can be viewed as an
address space with a single thread
4
Single thread state
within a process
5
Multiple threads in an
address space
6
What is a thread?
A thread executes a stream of instructions
 an abstraction for control-flow
Practically, it is a processor context and stack
 Allocated a CPU by a scheduler
 Executes in the context of a memory address space
7
Summary of private per-thread state
Stack (local variables)
Stack pointer
Registers
Scheduling properties (i.e., priority)
Set of pending and blocked signals
Other thread specific data
8
Shared state among
threads
Open files, sockets, locks
User ID, group ID, process/task ID
Address space
Text
 Data (off-stack global variables)
 Heap (dynamic data)

Changes made to shared state by one thread will be visible to the others
Reading and writing memory locations requires synchronization! … a major topic for later …
9
Independent execution
of threads
Each thread has its own stack
10
How do you program
using threads?
Split program into routines to execute
in parallel

True or pseudo (interleaved) parallelism
11
Why program using
threads?
Utilize multiple CPU’s concurrently
Low cost communication via shared
memory
Overlap computation and blocking on a
single CPU
Blocking due to I/O
 Computation and communication

Handle asynchronous events
12
Thread usage
A word processor with three threads
13
Processes versus threads - example
A WWW process
[Diagram, slides 14–18: a single HTTPD process serves “GET / HTTP/1.0” requests from disk; successive slides add a second HTTPD process and several concurrent GET requests.]
Why is this not a good web server design?
18
Threads in a web server
A multithreaded web server
19
Thread usage
Rough outline of code for previous slide
(a) Dispatcher thread
(b) Worker thread
20
System structuring
options
Three ways to construct a server
21
Common thread
programming models
Manager/worker
 Manager thread handles I/O and assigns work to worker
threads
 Worker threads may be created dynamically, or allocated
from a thread-pool
Peer
 Like manager/worker, but the manager participates in the work
Pipeline
 Each thread handles a different stage of an assembly line
 Threads hand work off to each other in a producer-consumer
relationship
22
What does a typical thread
API look like?
POSIX standard threads (Pthreads)
First thread exists in main(), typically creates
the others
pthread_create (thread,attr,start_routine,arg)
Returns new thread ID in “thread”
 Executes routine specified by “start_routine” with
argument specified by “arg”
 Exits on return from routine or when told explicitly

23
Thread API (continued)
pthread_exit (status)
 Terminates the thread and returns “status” to any joining thread
pthread_join (threadid,status)
 Blocks the calling thread until thread specified by “threadid” terminates
 Return status from pthread_exit is passed in “status”
 One way of synchronizing between threads
pthread_yield ()
 Thread gives up the CPU and enters the run queue
24
Using create, join and
exit primitives
25
An example Pthreads program
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5
void *PrintHello(void *threadid)
{
   printf("\n%d: Hello World!\n", (int)(long)threadid);
   pthread_exit(NULL);
}
int main (int argc, char *argv[])
{
   pthread_t threads[NUM_THREADS];
   int rc, t;
   for(t=0; t<NUM_THREADS; t++)
   {
      printf("Creating thread %d\n", t);
      rc = pthread_create(&threads[t], NULL, PrintHello, (void *)(long)t);
      if (rc)
      {
         printf("ERROR; return code from pthread_create() is %d\n", rc);
         exit(-1);
      }
   }
   pthread_exit(NULL);
}
Program Output
Creating thread 0
Creating thread 1
0: Hello World!
1: Hello World!
Creating thread 2
Creating thread 3
2: Hello World!
3: Hello World!
Creating thread 4
4: Hello World!
For more examples see: http://www.llnl.gov/computing/tutorials/pthreads
26
Pros & cons of threads
Pros
Overlap I/O with computation!
 Cheaper context switches
 Better mapping to shared memory
multiprocessors

Cons
Potential thread interactions
 Complexity of debugging
 Complexity of multi-threaded programming
 Backwards compatibility with existing code

27
Making single-threaded
code multithreaded
Conflicts between threads over the use of a global
variable
28
Making single-threaded code
multithreaded
Threads can have private global variables
29
User-level threads
Threads can be implemented in the OS
or at user level
User level thread implementations
thread scheduler runs as user code
 manages thread contexts in user space
 OS sees only a traditional process

30
Kernel-level threads
The thread-switching code is in the kernel
31
User-level threads package
The thread-switching code is in user space
32
User-level threads
Advantages
cheap context switch costs!
 User-programmable scheduling policy

Disadvantages
How to deal with blocking system calls!
 How to overlap I/O and computation!

33
Hybrid thread
implementations
Multiplexing user-level threads onto
kernel- level threads
34
Scheduler activations
Goal – mimic functionality of kernel threads
 gain performance of user space threads
The idea - kernel upcalls to user-level thread scheduling code when it handles a blocking system call or page fault
 user level thread scheduler can choose to run a different thread rather than blocking
 kernel upcalls when system call or page fault returns
Kernel assigns virtual processors to each process (which contains a user level thread scheduler)
 lets user level thread scheduler allocate threads to processors
Problem: relies on kernel (lower layer) calling procedures in user space (higher layer)
35
Concurrent programming
Assumptions:
 Two or more threads (or processes)
 Each executes in (pseudo) parallel and can’t predict exact running speeds
 The threads can interact via access to a shared variable
Example:
 One thread writes a variable
 The other thread reads from the same variable
Problem:
 The order of READs and WRITEs can make a big difference!
36
Race conditions
What is a race condition?
 two or more processes have an inconsistent view of a shared memory region (i.e., a variable)
Why do race conditions occur?
 values of memory locations replicated in registers during execution
 context switches at arbitrary times during execution
 processes can see “stale” memory values in registers
37
Counter increment race
condition
Incrementing a counter (load, increment, store)
Context switch can occur after load and before
increment!
38
Race Conditions
Race condition: whenever the output depends on the precise execution order of the processes!!!
What solutions can we apply?
 prevent context switches by preventing interrupts
 make threads coordinate with each other to ensure mutual exclusion in accessing critical sections of code
39
Mutual exclusion
conditions
No two processes simultaneously in critical
section
No assumptions made about speeds or
numbers of CPUs
No process running outside its critical section
may block another process
No process must wait forever to enter its
critical section
40
Critical sections with
mutual exclusion
41
How can we enforce
mutual exclusion?
What about using a binary “lock” variable in memory and having threads check it and set it before entry to critical regions?
Solves the problem of exclusive access to shared data:
 Expresses intention to enter critical section
 Acquiring a lock prevents concurrent access
Assumption:
 Every thread sets the lock before accessing shared data!
 Every thread releases the lock after it is done!
42
Acquiring and releasing locks
[Animation, slides 43–58: threads A, B, C and D call Lock() on a shared lock. The lock starts out Free; the first caller sets it and proceeds. While the lock is Set, other callers must wait. When the holder calls Unlock(), the lock is briefly Free, one waiting thread acquires it (setting it again), and the remaining threads keep waiting.]
58
Mutex locks
An abstract data type
Used for synchronization and mutual exclusion
The “mutex” is either:
 Locked   (“the lock is held”)
 Unlocked (“the lock is free”)
59
Mutex lock operations
Lock (mutex)
 Acquire the lock, if it is free
 If the lock is not free, then wait until it can be acquired
Unlock (mutex)
 Release the lock
 If there are waiting threads, then wake up one of them
Both Lock and Unlock are assumed to be atomic!!!
 A kernel implementation can ensure atomicity
60
An Example using a Mutex
Shared data:
Mutex myLock;

Thread A and Thread B each run:
1 repeat
2    Lock(myLock);
3    critical section
4    Unlock(myLock);
5    remainder section
6 until FALSE
61
But how can we implement a
mutex lock?
Does a binary “lock” variable in memory
work?
Many computers have some limited
hardware support for setting locks
“Atomic” Test and Set Lock instruction
 “Atomic” compare and swap operation

Can be used to implement “Mutex”
locks
62
Test-and-set-lock
instruction (TSL, tset)
A lock is a single word variable with two values
 0 = FALSE = not locked
 1 = TRUE = locked
Test-and-set does the following atomically:
 Get the (old) value
 Set the lock to TRUE
 Return the old value
If the returned value was FALSE...
 Then you got the lock!!!
If the returned value was TRUE...
 Then someone else has the lock
63
Test and set lock
[Animation, slides 64–75: the lock word starts FALSE. P1 executes test-and-set, gets back FALSE (lock available!), and the instruction has set the word to TRUE, so P1 holds the lock. P2, P3 and P4 then execute test-and-set, each getting back TRUE, so they keep retrying. When P1 stores FALSE to release the lock, one retrying process gets back FALSE, the word becomes TRUE again, and that process enters while the others continue to get TRUE.]
75
Critical section entry code with TSL
Threads I and J each run:
1 repeat
2    while(TSL(lock))
3       no-op;
4    critical section
5    lock = FALSE;
6    remainder section
7 until FALSE

Guarantees that only one thread at a time will enter its critical section
Note that processes are busy while waiting
 Spin locks
76
Busy waiting
Also called polling or spinning
 The thread consumes CPU cycles to evaluate when the lock becomes free!!!
Shortcoming on a single CPU system...
 A busy-waiting thread can prevent the lock holder from running & completing its critical section & releasing the lock!
 Better: Block instead of busy wait!
77
Quiz
What is the difference between a
program and a process?
Is the Operating System a program?
Is the Operating System a process?
What is the difference between
processes and threads?
What tasks are involved in switching the
CPU from one process to another?

Why is it called a context switch?
What tasks are involved in switching the
CPU from one thread to another?
78
Synchronization primitives
Sleep
 Put a thread to sleep
 Thread becomes BLOCKED
Wakeup
 Move a BLOCKED thread back onto the “Ready List”
 Thread becomes READY (or RUNNING)
Yield
 Move to another thread
 Does not BLOCK the thread
 Just gives up the current time-slice
79
But how can these be
implemented?
In User Programs:

System calls to the kernel
In Kernel:

Calls to the thread scheduler routines
80
Concurrency control in
the kernel
Different threads call Yield, Sleep, ...
Scheduler routines manipulate the “ready
list”
The ready list is shared data !
Problem:

How can scheduler routines be programmed
correctly?
Solution:
Scheduler can disable interrupts, or
 Scheduler can use the TSL instruction

81
Concurrency in the kernel
The kernel can avoid performing context switches while
manipulating the ready list
 prevents concurrent execution of system call code
 … but what about interrupts?
 … what if interrupt handlers touch the ready list?
Disabling interrupts during critical sections
 Ensures that interrupt handling code will not run
Using TSL for critical sections
 Ensures mutual exclusion for all code that follows
that convention
82
Disabling interrupts
Disabling interrupts in the OS vs
disabling interrupts in user processes
why not allow user processes to disable
interrupts?
 is it ok to disable interrupts in the OS?
 what precautions should you take?

83
Disabling interrupts in the
kernel
Scenario 1:
A thread is running; wants to access
shared data
Disable interrupts
Access shared data (“critical section”)
Enable interrupts
84
Disabling interrupts in the
kernel
Scenario 2:
Interrupts are already disabled and a
second thread wants to access the
critical section
...using the above sequence...
85
Disabling interrupts in the
kernel
Scenario 2: Interrupts are already disabled.
 Thread wants to access critical section using the previous sequence...
Save previous interrupt status (enabled/disabled)
Disable interrupts
Access shared data (“critical section”)
Restore interrupt status to what it was before
86
Classical Synchronization
Problems
Producer-Consumer
 One thread produces data items
 Another thread consumes them
 Use a bounded buffer / queue between the threads
 The buffer is a shared resource
   Must control access to it!!!
 Must suspend the producer thread if buffer is full
 Must suspend the consumer thread if buffer is empty
87
Producer/Consumer with Busy Waiting
Global variables:
char buf[n]
int InP = 0   // place to add
int OutP = 0  // place to get
int count

thread producer {
    while(1){
        // Produce char c
        while (count==n) {
            no_op
        }
        buf[InP] = c
        InP = InP + 1 mod n
        count++
    }
}

thread consumer {
    while(1){
        while (count==0) {
            no_op
        }
        c = buf[OutP]
        OutP = OutP + 1 mod n
        count--
        // Consume char
    }
}
88
Problems with this code
Count variable can be corrupted if
context switch occurs at the wrong time
A race condition exists!
 Race bugs very difficult to track down

What if buffer is full?
 Producer will busy-wait
 Consumer will not be able to empty the buffer
What if buffer is empty?
Consumer will busy-wait
 Producer will not be able to fill the buffer

89
Producer/Consumer with Blocking
Global variables:
char buf[n]
int InP = 0   // place to add
int OutP = 0  // place to get
int count

thread producer {
    while(1) {
        // Produce char c
        if (count==n) {
            sleep(full)
        }
        buf[InP] = c
        InP = InP + 1 mod n
        count++
        if (count == 1)
            wakeup(empty)
    }
}

thread consumer {
    while(1) {
        while (count==0) {
            sleep(empty)
        }
        c = buf[OutP]
        OutP = OutP + 1 mod n
        count--
        if (count == n-1)
            wakeup(full)
        // Consume char
    }
}
90
This code is still
incorrect!
The “count” variable can be corrupted:
Increments or decrements may be lost!
 Possible Consequences:

Both threads may sleep forever
 Buffer contents may be over-written

What is this problem called?
91
This code is still
incorrect!
The “count” variable can be corrupted:
Increments or decrements may be lost!
 Possible Consequences:

Both threads may sleep forever
 Buffer contents may be over-written

What is this problem called? Race
Condition
Code that manipulates count must be
made into a ??? and protected using
???
92
This code is still
incorrect!
The “count” variable can be corrupted:
Increments or decrements may be lost!
 Possible Consequences:

Both threads may sleep forever
 Buffer contents may be over-written

What is this problem called? Race
Condition
Code that manipulates count must be
made into a critical section and
protected using mutual exclusion!
93
Semaphores
An abstract data type that can be used
for condition synchronization and
mutual exclusion
What is the difference between mutual
exclusion and condition
synchronization?
94
Semaphores
An abstract data type that can be used
for condition synchronization and
mutual exclusion
Condition synchronization
wait until invariant holds before proceeding
 signal when invariant holds so others may
proceed

Mutual exclusion

only one at a time in a critical section
95
Semaphores
An abstract data type
containing an integer variable (S)
 Two operations: Down (S) and Up (S)

Alternative names for the two
operations
Down(S)
= Wait(S) = P(S)
 Up(S) = Signal(S) = V(S)

96
Semaphores
Down (S) … also called “Wait”
 decrement S by 1
 if S would go negative, wait/sleep until signaled
Up (S) … also called “Signal”
 increment S by 1
 signal/wakeup a waiting thread
S will always be >= 0.
Both Up () and Down () are assumed to be atomic!!!
 A kernel implementation must ensure atomicity
97
Variation: Binary
Semaphores
Counting Semaphores

same as just “semaphore”
Binary Semaphores
a specialized use of semaphores
 the semaphore is used to implement a
Mutex Lock

98
Variation: Binary
Semaphores
Counting Semaphores

same as just “semaphore”
Binary Semaphores
a specialized use of semaphores
 the semaphore is used to implement a
Mutex Lock
 the count will always be either
0 = locked
1 = unlocked

99
Using Semaphores for Mutex
semaphore mutex = 1   -- unlocked

Thread A and Thread B each run:
1 repeat
2    down(mutex);
3    critical section
4    up(mutex);
5    remainder section
6 until FALSE

[Animation, slides 100–106: the semaphore starts at 1 (unlocked). One thread's down(mutex) makes it 0 (locked) and that thread enters its critical section; the other thread's down(mutex) blocks. When the holder calls up(mutex), the semaphore returns to 1 (unlocked) and the blocked thread can now be released; its down(mutex) makes the semaphore 0 (locked) again.]
106
Exercise: Implement producer/consumer
Global variables:
semaphore full_buffs = ?;
semaphore empty_buffs = ?;
char buf[n];
int InP, OutP;

thread producer {
    while(1){
        // Produce char c...
        buf[InP] = c
        InP = InP + 1 mod n
    }
}

thread consumer {
    while(1){
        c = buf[OutP]
        OutP = OutP + 1 mod n
        // Consume char...
    }
}
107
Counting semaphores in producer/consumer
Global variables:
semaphore full_buffs = 0;
semaphore empty_buffs = n;
char buf[n];
int InP, OutP;

thread producer {
    while(1){
        // Produce char c...
        down(empty_buffs)
        buf[InP] = c
        InP = InP + 1 mod n
        up(full_buffs)
    }
}

thread consumer {
    while(1){
        down(full_buffs)
        c = buf[OutP]
        OutP = OutP + 1 mod n
        up(empty_buffs)
        // Consume char...
    }
}
108
Implementing semaphores
Up() and Down() are assumed to be atomic
How can we ensure that they are atomic?
Implement Up() and Down() as system calls?
how can the kernel ensure Up() and Down() are
completed atomically?
 avoid scheduling another thread when they are in
progress?
 … but how exactly would you do that?
 … and what about semaphores for use in the
kernel?

109
Semaphores with interrupt disabling
struct semaphore {
    int val;
    list L;
}

Down(semaphore sem):
    DISABLE_INTS
    sem.val--
    if (sem.val < 0){
        add thread to sem.L
        block(thread)
    }
    ENABLE_INTS

Up(semaphore sem):
    DISABLE_INTS
    sem.val++
    if (sem.val <= 0) {
        th = remove next thread from sem.L
        wakeup(th)
    }
    ENABLE_INTS
110
But what are block() and
wakeup()?
If block stops a thread from executing, how,
where, and when does it return?




which thread enables interrupts following Down()?
the thread that called block() shouldn’t return until
another thread has called wakeup() !
… but how does that other thread get to run?
… where exactly does the thread switch occur?
Scheduler routines such as block() contain
calls to switch() which is called in one thread
but returns in a different one!!
112
Semaphores using atomic instructions
As we saw earlier, hardware provides
special atomic instructions for
synchronization
test and set lock (TSL)
 compare and swap (CAS)
 etc

Semaphore can be built using atomic
instructions
1. build mutex locks from atomic instructions
2. build semaphores from mutex locks
113
Building blocking mutex locks using TSL
Mutex_lock:
    TSL REGISTER,MUTEX    | copy mutex to register and set mutex to 1
    CMP REGISTER,#0       | was mutex zero?
    JZE ok                | if it was zero, mutex is unlocked, so return
    CALL thread_yield     | mutex is busy, so schedule another thread
    JMP mutex_lock        | try again later
Ok: RET                   | return to caller; enter critical section

Mutex_unlock:
    MOVE MUTEX,#0         | store a 0 in mutex
    RET                   | return to caller
114
Building spinning mutex locks using TSL
Mutex_lock:
    TSL REGISTER,MUTEX    | copy mutex to register and set mutex to 1
    CMP REGISTER,#0       | was mutex zero?
    JZE ok                | if it was zero, mutex is unlocked, so return
    JMP mutex_lock        | mutex is busy, so spin: try again immediately
Ok: RET                   | return to caller; enter critical section

Mutex_unlock:
    MOVE MUTEX,#0         | store a 0 in mutex
    RET                   | return to caller
115
To block or not to block?
Spin-locks do busy waiting
wastes CPU cycles on uni-processors
 Why?

Blocking locks put the thread to sleep
may waste CPU cycles on multiprocessors
 Why?

116
Building semaphores using
mutex locks
Problem: Implement a counting semaphore
Up ()
Down ()
...using just Mutex locks
117
How about two “blocking” mutex locks?
var cnt: int = 0          -- Signal count
var m1: Mutex = unlocked  -- Protects access to “cnt”
var m2: Mutex = locked    -- Locked when waiting

Down():
    Lock(m1)
    cnt = cnt – 1
    if cnt<0
        Unlock(m1)
        Lock(m2)
    else
        Unlock(m1)
    endIf

Up():
    Lock(m1)
    cnt = cnt + 1
    if cnt<=0
        Unlock(m2)
    endIf
    Unlock(m1)
118
Oops! How about this then?
var cnt: int = 0          -- Signal count
var m1: Mutex = unlocked  -- Protects access to “cnt”
var m2: Mutex = locked    -- Locked when waiting

Down():
    Lock(m1)
    cnt = cnt – 1
    if cnt<0
        Lock(m2)
        Unlock(m1)
    else
        Unlock(m1)
    endIf

Up():
    Lock(m1)
    cnt = cnt + 1
    if cnt<=0
        Unlock(m2)
    endIf
    Unlock(m1)
120
Ok! Let's have another try!
var cnt: int = 0          -- Signal count
var m1: Mutex = unlocked  -- Protects access to “cnt”
var m2: Mutex = locked    -- Locked when waiting

Down():
    Lock(m2)
    Lock(m1)
    cnt = cnt – 1
    if cnt>0
        Unlock(m2)
    endIf
    Unlock(m1)

Up():
    Lock(m1)
    cnt = cnt + 1
    if cnt=1
        Unlock(m2)
    endIf
    Unlock(m1)

… is this solution valid?
122
What about this solution?
Mutex m1, m2;   // binary semaphores
int C = N;      // N is # locks
int W = 0;      // W is # wakeups

Down():
    Lock(m1);
    C = C – 1;
    if (C<0)
        Unlock(m1);
        Lock(m2);
        Lock(m1);
        W = W – 1;
        if (W>0)
            Unlock(m2);
        endif;
    else
        Unlock(m1);
    endif;

Up():
    Lock(m1);
    C = C + 1;
    if (C<=0)
        W = W + 1;
        Unlock(m2);
    endif;
    Unlock(m1);
123
Implementation possibilities
Implement Mutex Locks
... using Semaphores
Implement Counting Semaphores
... using Binary Semaphores
... using Mutex Locks
Can also implement using
Test-And-Set
Calls to Sleep, Wake-Up
Implement Binary Semaphores
... etc
124