
Chapter 2: Process/Thread
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity (求于至简,归于永恒: seek the utmost simplicity, return to the eternal)
1
Content
 Processes
 Threads
 Inter-process communication
 Classical IPC problems
 Scheduling
2
Definition of A Process
 Informal
–A program in execution
–A running piece of code along with all the things the
program can read/write
 Formal
–One or more threads in their own address space
 Note that process != program
3
The Need for Process
 What is the principal motivation for inventing
processes?
–To support multiprogramming
4
The Process Model
 Conceptual viewing of the processes
 Concurrency
–Multiple processes seem to run concurrently
–But in reality only one active at any instant
 Progress
–Every process makes progress
5
Multiprogramming of 4 programs
Figure: four programs in memory; the conceptual view shows four independent sequential programs; the time-line view shows that only one program is active at any instant
6
Process Creation
Principal events that cause process creation
1. System initialization
2. Execution of a process-creation system call
3. User request to create a new process
7
Process Termination
Conditions which terminate processes
1. Normal exit (voluntary)
2. Error exit (voluntary)
3. Fatal error (involuntary)
4. Killed by another process (involuntary)
8
Process Hierarchies
 Parent creates a child process
 Child processes can create their own child processes
 Process creation forms a hierarchy
–UNIX calls this a "process group"
–Windows has no concept of such hierarchy, i.e. all
processes are created equal
9
Process States
 Possible process states
–running, blocked, ready
State diagram: Running → Blocked (process blocks for input); Running → Ready (scheduler picks another process); Ready → Running (scheduler picks this process); Blocked → Ready (input becomes available)
10
Process Space
 Also called Address Space
 All the data the process uses as it runs
 Passive (acted upon by the process)
 Play analogy:
–all the objects on the stage in a play
11
Process Space
 Is the unit of state partitioning
–Each process occupies a different partition of the
computer's state
 Main topic:
–How multiple process spaces can share a single
physical memory efficiently and safely
12
Manage Process & Space
 Who manages processes & spaces?
–The operating system
 How does the OS achieve it?
–By maintaining information about processes
–i.e. using process tables
13
Fields of A Process Table
 Registers, Program counter, Status word
 Stack pointer, Priority, Process ID
 Parent process, Process group, Signals
 Time when process started
 CPU time used, Children’s CPU time, etc.
14
Problems with Process
 While supporting multiprogramming on shared hardware
 Itself is single threaded!
–i.e. a process can do only one thing at a time
–blocking call renders entire process unrunnable
15
Threads
 Invented to support multiprogramming at the process level
 Manage OS complexity
–Multiple users, programs, I/O devices, etc.
 Each thread dedicated to do one task
16
Thread
 Sequence of executing instructions from a program
–i.e. the running computation
 Play analogy: one actor on stage in a play
17
Threads
 Processes decompose a mix of activities into several
parallel tasks (columns)
 Each job can work independently of the others
Figure: three jobs (job1, job2, job3), each handled by its own thread (Thread 1, Thread 2, Thread 3)
18
The Thread Model
Figure: (a) three processes (Proc 1, Proc 2, Proc 3), each with one thread, in user space above the kernel; (b) one process with three threads
19
The Thread Model
 Some items shared by all threads in a process
 Some items private to each thread
20
Shared and Private Items
Per-process items: Address space, Global variables, Open files, Child processes, Pending alarms, Signals and signal handlers, Accounting information
Per-thread items: Program counter, Registers, Stack, State
21
Shared and Private Items
Each thread has its own stack
22
A Word Processor w/3 Threads
Figure: a word processor with an input thread, a display thread, and a backup thread
23
Implementation of Thread
 How many options are there to implement threads?
 Implement in kernel space
 Implement in user space
 Hybrid implementation
24
Kernel-Level Implementation
 Completely implemented in kernel space
 OS acts as a scheduler
 OS maintains information about threads
–In addition to processes
25
Kernel-Level Implementation
26
Kernel-Level Implementation
 Advantages:
–Easier to program
–A blocking thread does not block the process
 Problems:
–Costly: need to trap into the OS to switch threads
–OS space is limited (for maintaining thread info)
–Need to modify the OS!
27
User-Level Implementation
 Completely implemented in user space
 A run-time system acts as a scheduler
 Threads voluntarily cooperate
–i.e. yield control to other threads
 The OS need not know about the existence of threads
28
User-Level Implementation
29
User-Level Implementation
 Advantages:
–Flexible, can be implemented on any OS
–Faster: no need to trap into the OS
 Problems:
–Programming is tricky
–a blocking thread blocks the whole process!
30
User-Level Implementation
 How do we solve the problem of a blocking thread
blocking the whole process?
 Modify system calls to be non-blocking
 Write a wrapper around blocking calls
31
Scheduler Activations
 A technique that solves the problem of blocking calls
in user-level threads
 Method: use upcalls
 Goal: mimic the functionality of kernel threads
–while gaining the performance of user-space threads
32
Scheduler Activations
 Kernel assigns virtual processors to each process
 The runtime system allocates threads to processors
 Blocking threads are handled by an OS upcall
–i.e. the OS notifies the runtime system about blocking calls
33
Scheduler Activations
 Problem:
 Reliance on the kernel (lower layer) calling procedures
in user space (higher layer)
 This violates the layered structure of OS design
34
Hybrid Implementation
 Can we have the best of both worlds
–i.e. kernel-level and user-level implementations
 While avoiding the problems of either?
 Hybrid implementation
35
Hybrid Implementation
 User-level threads are managed by the runtime system
 Kernel-level threads are managed by the OS
 User-level threads are multiplexed onto kernel-level
threads
36
Hybrid Implementation
37
Multiple Threads
 Can have several threads in a single address space
–That is what threads were invented for
 Play analogy: several actors on a single set
–Sometimes interact (e.g. dance together)
–Sometimes do independent tasks
38
Multiple Threads
 Private state for a thread vs. global state shared
between threads
–What private state must a thread have?
–Other state is shared between all threads in a process
39
Multiple Threads
Per-process items: Address space, Global variables, Open files, Child processes, Pending alarms, Signals and signal handlers, Accounting information
Per-thread items: Program counter, Registers, Stack, State
40
Multiple Threads
 Many programs are written as single-threaded
processes
 Making them multithreaded is very tricky
–Can cause unexpected problems
41
Multiple Threads
Conflicts between threads over the use of a global variable
42
Multiple Threads
 Many solutions:
 Prohibit global variables
 Assign each thread its own private global variables
43
Multiple Threads
Threads can have
private global variables
44
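In practice, thread-private "global" variables are usually realized with thread-local storage. Below is a minimal C11 sketch (the variable name and counts are illustrative, not from the slides): each thread increments its own copy, so the threads cannot conflict.

#include <pthread.h>
#include <stdio.h>

/* Thread-local: each thread gets its own private copy */
_Thread_local int request_count = 0;

void *worker(void *arg)
{
    for (int i = 0; i < 3; i++)
        request_count++;        /* touches only this thread's copy */
    printf("thread %ld: request_count = %d\n", (long)arg, request_count);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;                   /* each thread prints 3: no interference */
}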
Cooperating Threads
 Often we create threads to cooperate
 Each thread handles one request
 Each thread can issue a blocking disk I/O,
–wait for I/O to finish
–then continue with the next part of its request
45
Cooperating Threads
 Ordering of events from different threads is
non-deterministic
–e.g. after 10 seconds, different threads may have
gotten differing amounts of work done
 thread A --------------------------------->
 thread B -------------------->
 thread C ------->
48
Cooperating Threads
 Example
–thread A: x=1
–thread B: x=2
 Possible results?
 Is 3 a possible output?
–yes
49
Cooperating Threads
 3 is possible because assignment is not atomic and
threads can interleave
 if assignment to x is atomic
–then only possible results are 1 and 2
50
Atomic Operations
 Atomic: indivisible
–Either happens in its entirety without interruption or
has yet to happen at all
 No events from other threads can happen in between
start & end of an atomic event
51
Atomic Operations
 On most machines, memory load & store are atomic
 But many instructions are not atomic
–e.g. a double-precision floating-point store on a
32-bit machine is two separate memory operations
52
Atomic Operations
 If you don’t have any atomic operations
–you can’t make one
 Fortunately, the hardware folks give us atomic
operations
–and we can build up higher-level atomic primitives
from there
53
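As a concrete illustration (a sketch using C11 atomics, not from the slides): atomic_fetch_add is one such hardware-backed atomic read-modify-write, so concurrent increments are never lost, whereas a plain counter++ could be.

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

atomic_int counter = 0;          /* updated with atomic instructions */

void *inc(void *arg)
{
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&counter, 1);   /* atomic read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, inc, NULL);
    pthread_create(&b, NULL, inc, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("%d\n", atomic_load(&counter));  /* always 200000 */
    return 0;
}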
Atomic Operations
 Another example (i is shared, initially 0):

thread A:
    i = 0;
    while (i < 10)
        i++;
    print "A finished"

thread B:
    i = 0;
    while (i > -10)
        i--;
    print "B finished"
 Who will win?
54
Atomic Operations
 Is it guaranteed that someone will win?
 What if threads run at exactly the same speed and
start close together?
 Is it guaranteed that it goes on forever?
55
Atomic Operations
 Arithmetic example
–(initially y=10)
–thread A: x = y + 1;
–thread B: y = y * 2;
 Possible results?
56
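A worked enumeration (assuming individual loads and stores are atomic): if thread A reads y before B's write, x = 11; if A reads y after B's write, x = 21; y ends up 20 either way. So the possible results are (x = 11, y = 20) and (x = 21, y = 20).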
Thread Synchronization
 Must control the interleavings between threads
–Or the results can be non-deterministic
 All possible interleavings must yield a correct
answer
57
Thread Synchronization
 Try to constrain the thread executions as little as
possible
 Controlling the execution and order of threads is
called “synchronization”
58
Gold Fish Problem
 Problem definition:
 Tracy and Peter want to keep a gold fish alive
–By feeding the fish properly
 if either sees that the fish is not fed,
–she/he goes to feed the fish
 The fish must be fed once and only once each day
59
Correctness Properties
 Someone will feed the fish if needed
 But never more than one person feeds the fish
60
Solution #0
 No synchronization

Peter:
    if (noFeed) {
        feed fish
    }

Tracy:
    if (noFeed) {
        feed fish
    }
61
Problem with Solution #0
3:00  Peter: look at the fish (not fed)
3:05  Tracy: look at the fish (not fed)
3:10  Peter: feed fish
3:25  Tracy: feed fish
Fish overeats!
62
Problem with Solution #0
 Overeating occurred because:
–They executed the same code at the same time
–i.e. if (noFeed) feed fish
 This is called a race
–Two or more threads try to access the same shared
resource at the same time
63
Critical Section
 The shared resource or code is called a critical
section
–the region where race condition occurs
 Access to it must be coordinated
–i.e. only one process can access it at a time
64
Critical Section
 In solution #0,
critical section is
– “if (noFeed) feed fish”
 Peter and Tracy must NOT be inside the critical
section at the same time
65
1st Type of Synchronization
 Ensure only one person is in the critical section
–i.e. only 1 person goes shopping at a time
 This is called mutual exclusion:
–only 1 thread is doing a certain thing at one time
–others are excluded
66
Gold Fish Solution #1
 Idea:
–leave a note that you're going to check on the fish's
status, so the other person doesn't also feed
73
Solution #1
Peter:
    if (noNote) {
        leave note
        if (noFeed) {
            feed fish
        }
        remove note
    }

Tracy:
    if (noNote) {
        leave note
        if (noFeed) {
            feed fish
        }
        remove note
    }
74
Solution #1
 Does this work?
 If not, when could it fail?
 Is solution #1 better than solution #0?
75
Solution #2
 Idea: change the order of “leave note” and “check
note”
 This requires labeled notes
–otherwise you’ll see your own note
–and think it was the other person’s note
76
Solution #2
Peter:
    leave notePeter
    if (no noteTracy) {
        if (noFeed) {
            feed fish
        }
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
77
Solution #2
 Does this work?
–Yes, it solves the overeating problem
 If not, when could it fail?
–But it introduces a starvation problem: if both leave
notes at the same time, neither feeds the fish
78
Solution #3
 Idea:
–have a way to decide who will feed fish when both
leave notes at the same time.
 Approach:
–Have Peter hang around to make sure job is done
79
Solution #3
Peter:
    leave notePeter
    while (noteTracy) {
        do nothing
    }
    if (noFeed) {
        feed fish
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
80
Solution #3
 Peter's "while (noteTracy)" prevents him from running
his critical section at the same time as Tracy
 It also prevents Peter from running off without making
sure that someone feeds the fish
81
Solution #3
 Correct, but ugly
–Complicated (non-intuitive) to prove correct
 Asymmetric
–Peter and Tracy run different code
–This makes coding difficult
85
Solution #3
 Wasteful
–Peter consumes CPU time while waiting for Tracy to
remove note
 Constantly checking some status while waiting on
something is called busy-waiting
–Very very bad
86
Higher-Level Synchronization
 What is the solution?
–Raise the level of abstraction
 Make life easier for programmers
87
Higher-Level synchronization
Concurrent programs
High level synchronization operations provided
by software (i.e. semaphore, lock, monitor)
Low-level atomic operations provided by hardware
(i.e. load/store, interrupt enable/disable, test & set)
88
Lock (Mutex)
 A Lock prevents another thread from entering a
critical section
–e.g. lock the fridge while you’re shopping so that
both Peter and Tracy don’t go shopping
89
Lock (Mutex)
 Two operations
–lock(): wait until lock is free, then acquire it

    do {
        if (lock is free) {
            acquire lock; break
        }
    } while (1)

–unlock(): release lock
90
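For concreteness, here is a minimal sketch of the same two operations using the POSIX threads API (the shared balance and the deposit amounts are made up for illustration):

#include <pthread.h>
#include <stdio.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;  /* lock starts out free */
int balance = 0;                                /* shared data */

void *deposit(void *arg)
{
    pthread_mutex_lock(&m);     /* lock(): wait until free, then acquire */
    balance += 10;              /* critical section */
    pthread_mutex_unlock(&m);   /* unlock(): release */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, deposit, NULL);
    pthread_create(&t2, NULL, deposit, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("balance = %d\n", balance);  /* always 20 */
    return 0;
}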
Lock (Mutex)
 Why was the "note" in Gold Fish solutions #1 and #2
not a good lock?
 Does it meet the 4 conditions?
91
Four Elements of Locking
 Lock is initialized to be free
 Acquire lock before entering critical section
 Release lock when exiting critical section
 Wait to acquire lock if another thread already holds
it
92
Solve “Too much fish” with lock
Peter:
    lock()
    if (noFeed) {
        feed fish
    }
    unlock()

Tracy:
    lock()
    if (noFeed) {
        feed fish
    }
    unlock()
93
Solve “Too much fish” with lock
 But this prevents Tracy from doing stuff while Peter
is shopping.
–I.e. critical section includes the shopping time.
 How to minimize the time the lock is held?
94
Producer-Consumer Problem
 Problem:
–producer puts things into a shared buffer
–Consumer takes them out
–Need synchronization for coordinating producer and
consumer
Figure: producer → shared buffer → consumer
95
Producer-Consumer Problem
 Buffer between producer and consumer allows them to
operate somewhat independently
96
Producer-Consumer Problem
 Otherwise must operate in lockstep
–producer puts 1 thing in buffer, then consumer takes
it out
–then producer adds another, then consumer takes it out,
etc
97
Producer-Consumer Problem
 Coke machine example
–delivery person (producer) fills machine with cokes
–students (consumers) buy cokes and drink them
–coke machine has finite space (buffer)
98
Solution for PC Problem
 What does producer do when buffer full?
 What does consumer do when buffer empty?
 The busy waiting solution in Solution 3 is
unacceptable
99
Solution for PC Problem
 Use Sleep and Wakeup primitives
 Consumer sleeps if buffer is empty
–Wake up sleeping producers otherwise
 Producer sleeps if buffer is full
–Wake up sleeping consumers otherwise
100
Sleep and Wakeup
#define N 100                    /* # of slots in the buffer */
int count = 0;                   /* # of items in the buffer */

void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        if (count == N) sleep();           /* buffer full: go to sleep */
        insert_item(item);
        count = count + 1;
        if (count == 1) wakeup(consumer);  /* buffer was empty */
    }
}
101
Sleep and Wakeup
void consumer(void)
{
    int item;
    while (TRUE) {
        if (count == 0) sleep();           /* buffer empty: go to sleep */
        item = remove_item();
        count = count - 1;
        if (count == N-1) wakeup(producer);  /* buffer was full */
        consume_item(item);
    }
}
102
What is the problem?
 This producer-consumer solution has a fatal race condition
–Access to "count" is unrestrained
–A wakeup call could get lost
 Solution: use semaphores
103
Semaphore
 Semaphores are like a generalized lock
 It has a non-negative integer value
 Semaphore supports the following ops:
–down, up
104
Down Operation
 Wait for semaphore to become positive
 Then decrement semaphore by 1
–Originally called “P” operation for the Dutch proberen
105
Up Operation
 Increment semaphore by 1
–originally called “V”, for the Dutch verhogen
 This wakes up a thread waiting in down
–if there are any
106
Semaphores
 Can also set the initial value of semaphore
 The key parts in down() and up() are atomic
 Two down() calls at the same time can’t decrement the
value below 0
107
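To make this concrete, here is a minimal teaching sketch of a counting semaphore built from a POSIX mutex and condition variable (condition variables are covered later in this chapter; this is not how a real sem_t is implemented, and names like sem_down are our own):

#include <pthread.h>

typedef struct {
    int value;                   /* non-negative semaphore count */
    pthread_mutex_t m;
    pthread_cond_t c;
} semaphore;

/* e.g. initial value 1: */
semaphore s = {1, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER};

void sem_down(semaphore *s)      /* "P": wait for positive, then decrement */
{
    pthread_mutex_lock(&s->m);   /* makes the key part atomic */
    while (s->value == 0)
        pthread_cond_wait(&s->c, &s->m);
    s->value--;                  /* can never go below 0 */
    pthread_mutex_unlock(&s->m);
}

void sem_up(semaphore *s)        /* "V": increment, wake a waiter if any */
{
    pthread_mutex_lock(&s->m);
    s->value++;
    pthread_cond_signal(&s->c);
    pthread_mutex_unlock(&s->m);
}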
Binary Semaphores
 Value is either 0 or 1
 down() waits for value to become 1
–then sets it to 0
 up() sets value to 1
–waking up waiting down (if any)
108
Mutual Exclusion w/ Semaphore
 Initial value is 1 (or more generally, N)
–down()
–<critical section>
–up()
 Like lock/unlock, but more general
109
Semaphores for Ordering
 Usually (not always) the initial value is 0
 Example: thread A wants to wait for thread B to
finish before continuing
–Semaphore initialized to 0

Thread A:
    down()
    continue execution

Thread B:
    do task
    up()
110
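A minimal POSIX sketch of this ordering pattern (the thread names and printed messages are illustrative):

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t done;   /* initialized to 0: "task not finished yet" */

void *thread_b(void *arg)
{
    printf("B: doing task\n");   /* do task */
    sem_post(&done);             /* up(): signal completion */
    return NULL;
}

int main(void)   /* plays the role of thread A */
{
    pthread_t b;
    sem_init(&done, 0, 0);       /* initial value 0 */
    pthread_create(&b, NULL, thread_b, NULL);
    sem_wait(&done);             /* down(): blocks until B posts */
    printf("A: continuing execution\n");
    pthread_join(b, NULL);
    return 0;
}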
PC Problem with Semaphores
 Let’s solve the producer-consumer problem in
semaphores
111
Semaphore Assignments
 mutex:
–ensures mutual exclusion around code that manipulates
buffer queue (initialized to 1)
 full:
–counts the # of full buffers (initialized to 0)
112
Semaphore Assignments
 empty:
–counts the number of empty buffers (initialized to N)
 Why do we need different semaphores for full and
empty buffers?
113
Solve PC with Semaphores
#define N 100                 /* # of slots in the buffer */
typedef int semaphore;        /* a special kind of int */
semaphore mutex = 1;          /* controls access to critical region */
semaphore empty = N;          /* counts empty buffer slots */
semaphore full = 0;           /* counts full buffer slots */
114
Solve PC with Semaphores
void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        down(&empty);          /* decrement empty count */
        down(&mutex);          /* enter critical region */
        insert_item(item);
        up(&mutex);            /* leave critical region */
        up(&full);             /* increment count of full slots */
    }
}
115
Solve PC with Semaphores
void consumer(void)
{
    int item;
    while (TRUE) {
        down(&full);           /* decrement full count */
        down(&mutex);          /* enter critical region */
        item = remove_item();
        up(&mutex);            /* leave critical region */
        up(&empty);            /* increment count of empty slots */
        consume_item(item);
    }
}
116
Semaphore Assignments
 Does the order of the down() function calls matter in
the consumer (or the producer)?
 Does the order of the up() function calls matter in
the consumer (or the producer)?
117
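A hint, for the record (standard reasoning, not on the slide): the order of the down() calls does matter. If the producer ran down(&mutex) before down(&empty) and the buffer were full, it would go to sleep holding mutex, the consumer could never enter its critical region, and both would deadlock. The order of the up() calls affects only efficiency, because up() never blocks.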
Solve PC with Semaphores
 What (if anything) must change to allow multiple
producers and/or multiple consumers?
 What if there’s 1 full buffer, and multiple consumers
call down(full) at the same time?
118
Problem with Semaphore
 Is there any problem with semaphore?
 The order of down and up ops is critical
–Improper ordering could cause deadlock
–i.e. programming with semaphores is tricky
 How to make programming easier?
119
Monitor
 A programming language construct
 A collection of procedures, variables, data
structures
 Access to it is guaranteed to be exclusive
–By the compiler (not programmer)
120
Monitor
 Monitors use separate mechanisms for the two types of
synchronization
 use locks for mutual exclusion
 use condition variables for ordering constraints
121
Monitor
 A monitor = a lock + the condition variables
associated with that lock
122
Condition Variables
 Main idea:
–make it possible for thread to sleep inside a critical
section
 Approach:
–by atomically releasing the lock, putting the thread
on a wait queue, and going to sleep
123
Condition Variables
 Each variable has a queue of waiting threads
–threads that are sleeping, waiting for a condition
 Each variable is associated with one lock
124
Monitors
Monitor example:

monitor example
    integer i;
    condition c;

    procedure producer();
    ...
    end;

    procedure consumer();
    ...
    end;
end monitor
125
Ops on Condition Variables
 wait():
–atomically release lock
–put thread on condition wait queue, go to sleep
–i.e. start to wait for wakeup
126
Ops on Condition Variables
 signal():
–wake up a thread waiting on this condition variable if
any
 broadcast():
–wake up all threads waiting on this condition variable
if any
127
Ops on Condition Variables
 Note that a thread must hold the lock when it calls
wait() or signal()
 To avoid problems when both threads are inside the
monitor
–signal() must be the last statement
128
Ops on Condition Variables
 What to do when a thread wakes up?
 Two options:
 Let the woken-up thread run immediately
–i.e. the signaler releases the lock
 Let the signaler and the woken-up thread compete for
the lock
129
Mesa vs. Hoare Monitors
 The first option is called Hoare Monitor
–It gives special priority to the woken-up waiter
–signaling thread gives up lock
–woken-up waiter acquires lock
–signaling thread re-acquires lock after waiter unlocks
130
Mesa vs. Hoare Monitors
 The second option is Mesa Monitor
–when waiter is woken, it must contend for the lock
with other threads
–hence must re-check condition
–Whoever wins gets to run
131
Mesa vs. Hoare Monitors
 We’ll stick with Mesa monitors
–as do most operating systems
–Because it is more flexible
132
Programming with Monitors
 List shared data needed to solve problem
 Decide how many locks are needed
 Decide which locks will protect which data
133
Programming with Monitors
 More locks allow different data to be accessed
simultaneously
–i.e. protecting finer-grained data
–but is more complicated
 One lock usually enough in this class
134
Programming with Monitors
 Call wait() when thread needs to wait for a condition
to be true;
 Use a while loop to re-check condition after wait
returns
 Call signal() when a condition changes that another
thread might be interested in (see the sketch below)
137
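A minimal POSIX sketch of this pattern (the condition "items > 0" and the names are illustrative):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
int items = 0;                  /* shared state protected by m */

void consume_one(void)
{
    pthread_mutex_lock(&m);
    while (items == 0)          /* while, not if: re-check after wakeup */
        pthread_cond_wait(&c, &m);   /* atomically releases m and sleeps */
    items--;                    /* condition holds and lock is held */
    pthread_mutex_unlock(&m);
}

void produce_one(void)
{
    pthread_mutex_lock(&m);
    items++;                    /* change the condition... */
    pthread_cond_signal(&c);    /* ...then wake a waiting thread */
    pthread_mutex_unlock(&m);
}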
Producer-Consumer in Monitor
 Variables:
 numCokes (number of cokes in machine)
 One lock to protect this shared data
–cokeLock
 fewer locks make the programming simpler
–but allow less concurrency
138
Producer-Consumer in Monitor
 Ordering constraints:
 Consumer must wait for producer to fill buffer if all
buffers are empty
 Producer must wait for consumer to empty a buffer if
the buffer is completely full
139
Producer-Consumer in Monitor
monitor ProducerConsumer
    condition full, empty;
    integer count;

    procedure insert(item: integer);
    begin
        if count == N then wait(full);
        insert_item(item);
        count++;
        if count == 1 then signal(empty);
    end;

    function remove: integer;
    begin
        if count == 0 then wait(empty);
        remove = remove_item;
        count--;
        if count == N-1 then signal(full);
    end;

    count := 0;
end monitor;
140
Producer-Consumer in Monitor
procedure producer;
begin
    while true do
    begin
        item = produce_item;
        ProducerConsumer.insert(item);
    end
end;

procedure consumer;
begin
    while true do
    begin
        item = ProducerConsumer.remove;
        consume_item(item);
    end
end;
141
Condition Variable vs. Semaphore
 Condition variables are more flexible than using
semaphores for ordering constraints
 Condition variables:
–can use arbitrary condition to wait
 Semaphores:
–wait if semaphore value == 0
145
Problems with Monitor
 Many languages do not support monitors
 Monitors don't solve the problem when multiple CPUs
or computers are involved
–This applies to semaphores too
 Solution?
–Message passing
147
Message Passing
 A high-level primitive for IPC
 It uses two primitives: send and receive
–Send(destination, &message)
–Receive(source, &message)
 They are system calls (not language constructs)
 Can either block or return immediately
148
Message Passing
#define N 100                 /* # of slots in the buffer */

void producer(void)
{
    int item;
    message m;                /* message buffer */
    while (TRUE) {
        item = produce_item();
        receive(consumer, &m);      /* wait for an empty message */
        build_message(&m, item);    /* construct a message to send */
        send(consumer, &m);
    }
}
149
Message Passing
void consumer(void)
{
    int item, i;
    message m;
    for (i = 0; i < N; i++) send(producer, &m);  /* send N empties */
    while (TRUE) {
        receive(producer, &m);      /* get a message with an item */
        item = extract_item(&m);
        send(producer, &m);         /* send back an empty reply */
        consume_item(item);
    }
}
150
Problems with Message Passing
 Many challenging problems remain
 Message loss
–What happens when a message is lost?
 Authentication
–How to determine the identity of the sender?
 Performance
151
Barriers
 Another synchronization primitive
 Intended for a group of processes
 All processes must reach the barrier for the
application to proceed to the next phase
152
Barriers
–(a) processes approaching a barrier
–(b) all processes but one blocked at the barrier
–(c) the last process arrives, and all are let through
153
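A minimal POSIX sketch of the barrier idea (phase names and the thread count are illustrative; pthread_barrier_t is a POSIX feature that is optional on some systems):

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
pthread_barrier_t barrier;

void *phase_worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld: finished phase 1\n", id);
    pthread_barrier_wait(&barrier);   /* block until all 4 arrive */
    printf("thread %ld: starting phase 2\n", id);  /* all pass together */
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, phase_worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}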
Implementing Locks
 So far we have used locks extensively
 We assumed that lock operations are atomic
 But how is the atomicity of locks implemented?
154
Implementing Locks
 Locks must be implemented on top of hardware ops
Concurrent programs
High level synchronization operations provided
by software (i.e. semaphore, lock, monitor)
Low-level atomic operations provided by hardware
(i.e. load/store, interrupt enable/disable, test & set)
155
Use Interrupt Disable/Enable
 On a uniprocessor, an operation is atomic as long as
–a context switch doesn't occur in the middle of the operation
 How does thread get context switched out?
–interrupt
 Prevent context switches at wrong time by preventing
these events
156
Use Interrupt Disable/Enable
 With interrupt disable/enable to ensure atomicity,
–why do we need locks?
 User program could call interrupt disable
–before entering critical section
–and call interrupt enable after leaving critical
section
–and make sure not to call yield in the critical
section
157
Lock Implementation #1
 Disable interrupts with busy waiting
158
Lock Implementation #1
lock() {
    disable interrupts
    while (value != FREE) {
        enable interrupts
        disable interrupts
    }
    value = BUSY
    enable interrupts
}

Why does lock() disable interrupts at the beginning of the function?
Why is it ok to disable interrupts in lock()'s critical section when it wasn't ok to disable interrupts while user code was running?
159
Lock Implementation #1
unlock() {
    disable interrupts
    value = FREE
    enable interrupts
}
Do we need to disable interrupts in unlock()?
160
Problems with Interrupt Approach
 Interrupt disable works on a uniprocessor
–by preventing the current thread from being switched out
 But this doesn't work on a multiprocessor
 Disabling interrupts on one processor doesn't prevent
other processors from running
 It is not acceptable (or provided) to modify interrupt
disable to stop other processors from running
161
Read-modify-write Instructions
 Another atomic primitive
 Could use atomic load / atomic store instructions
–remember Gold Fish solution #3
162
Read-modify-write Instructions
 Modern processors provide an easier way
–with atomic read-modify-write instructions
 Read-modify-write atomically
–reads value from memory into a register
–Then writes new value to that memory location
163
Read-modify-write Instructions
 test_and_set
–atomically writes 1 to a memory location (set)
–and returns the value that used to be there (test)

test_and_set(X) {
    tmp = X
    X = 1
    return(tmp)
}
164
Lock Implementation #2
 Test & set with busy waiting
(value is initially 0)
lock() {
    while (test_and_set(value) == 1) {}   /* spin until we set 0 -> 1 */
}

unlock() {
    value = 0
}
165
Lock Implementation #2
 If lock is free (value = 0)
–test_and_set sets value to 1 and returns 0,
–so the while loop finishes
 If lock is busy (value = 1)
–test_and_set doesn’t change the value and returns 1,
–so loop continues
166
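For comparison, here is a sketch of the same spinlock using C11's portable test-and-set primitive (atomic_flag_test_and_set returns the old value, like the slide's test_and_set):

#include <stdatomic.h>

atomic_flag value = ATOMIC_FLAG_INIT;   /* initially clear: lock is free */

void lock(void)
{
    /* returns previous value: spin while it was already set (busy) */
    while (atomic_flag_test_and_set(&value))
        ;   /* busy-wait */
}

void unlock(void)
{
    atomic_flag_clear(&value);   /* value = 0: lock is free again */
}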
Strategy for Reducing Busy-Waiting
 In methods 1 & 2, the waiting thread uses lots of CPU
time just checking for the lock to become free
 Better for it to sleep and let other threads run
167
Lock Implementation #3
 Interrupt disable, no busy-waiting
 Waiting thread gives up processor so that other
threads (e.g. thread with lock) can run more quickly
 Someone wakes up thread when the lock is free
168
Lock Implementation #3
lock() {
    disable interrupts
    if (value == FREE) {
        value = BUSY
    } else {
        add thread to queue of threads waiting for this lock
        switch to next runnable thread
    }
    enable interrupts
}
When should lock() re-enable interrupts before calling switch?
169
Lock Implementation #3
unlock() {
    disable interrupts
    value = FREE
    if (any thread is waiting for this lock) {
        move waiting thread from waiting queue to ready queue
        value = BUSY
    }
    enable interrupts
}
170
Interrupt Disable/Enable Pattern
Enable interrupts before adding thread to wait queue?

lock() {
    disable interrupts
    ...
    if (lock is busy) {
        enable interrupts
        add thread to lock wait queue
        switch to next runnable thread
    }
}

When could this fail?
172
Interrupt Disable/Enable Pattern
 Enable interrupts after adding thread to wait queue,
but before switching to next thread?
173
Interrupt Disable/Enable Pattern
lock() {
    disable interrupts
    ...
    if (lock is busy) {
        add thread to lock wait queue
        enable interrupts
        switch to next runnable thread
    }
}
174
Interrupt Disable/Enable Pattern
 But this fails if an interrupt happens after the
thread enables interrupts
–lock() adds thread to wait queue
–lock() enables interrupts
–an interrupt causes preemption,
–i.e. a switch to another thread
175
Interrupt Disable/Enable Pattern
 Preemption moves thread to ready queue
–Now thread is on two queues (wait and ready)!
 Also, switch is likely to be a critical section
–Adding thread to wait queue and switching to next
thread must be atomic
176
Solution
 Waiting thread leaves interrupts disabled
–when it calls switch
 Next thread to run has the responsibility of
–re-enabling interrupts before returning to user code
 When waiting thread wakes up
–it returns from switch with interrupts disabled from
the last thread
177
Invariant
 All threads promise to have interrupts disabled when
they call switch
 All threads promise to re-enable interrupts after
they get returned to from switch
178
Thread A                          Thread B
yield() {
  disable interrupts
  switch
                                  back from switch
                                  enable interrupts
                                  }
                                  <user code runs>
                                  lock() {
                                    disable interrupts
                                    ...
                                    switch
back from switch
enable interrupts
}
<user code runs>
unlock() (move thread B to ready queue)
yield() {
  disable interrupts
  switch
179
Lock Implementation #4
 Test & set, minimal busy-waiting
 Can’t implement locks using test & set without some
amount of busy-waiting
–but can minimize it
 Idea:
–use busy waiting only to atomically execute lock code
–Give up CPU if busy
180
Lock Implementation #4
lock() {
    while (test_and_set(guard)) {
    }
    if (value == FREE) {
        value = BUSY
    } else {
        add thread to queue of threads waiting for this lock
        switch to next runnable thread
    }
    guard = 0
}
181
Lock Implementation #4
unlock() {
    while (test_and_set(guard)) {
    }
    value = FREE
    if (any thread is waiting for this lock) {
        move waiting thread from waiting queue to ready queue
        value = BUSY
    }
    guard = 0
}
182
CPU Scheduling
 How should one choose the next thread to run? What
are the goals of the CPU scheduler?
 Bursts of CPU usage alternate with periods of I/O
wait
183
CPU Scheduling
(a) CPU-bound process
(b) I/O bound process
184
CPU Scheduling
 Minimize average response time
–average elapsed time to do each job
 Maximize throughput of entire system
–rate at which jobs complete in the system
 Fairness
–share CPU among threads in some “equitable” manner
185
Scheduling Algorithm Goals
 All systems
 Fairness:
–giving each process a fair share of the CPU
 Policy enforcement:
–seeing that stated policy is carried out
 Balance:
–keeping all parts of the system busy
186
Scheduling Algorithm Goals
 Batch systems
 Throughput:
–maximize jobs per hour
 Turnaround time:
–minimize time between submission and termination
 CPU utilization:
–keep the CPU busy all the time
187
Scheduling Algorithm Goals
 Interactive systems
–Response time – respond to requests quickly
–Proportionality – meet users’ expectations
 Real-time systems
–Meeting deadlines – avoid losing data
–Predictability – avoid quality degradation in
multimedia systems
188
FCFS
 First-Come, First-Served
 FIFO ordering between jobs
 No preemption (run until done)
–thread runs until it calls yield() or blocks on I/O
–no timer interrupts
189
FCFS
 Pros and cons
 + simple
 - short jobs get stuck behind long jobs
 - what about the user’s interactive experience?
190
FCFS
 Example
–job A takes 100 seconds
–job B takes 1 second
–time 0: job A arrives and starts
–time 0+: job B arrives
–time 100: job A ends (response time = 100); job B starts
–time 101: job B ends (response time = 101)
 average response time = 100.5
191
Round Robin
 Goal:
–improve average response time for short jobs
 Solution:
–periodically preempt jobs (in particular long-running ones)
 Is FCFS or round robin more “fair”?
192
Round Robin
 Example
 job A takes 100 seconds
 job B takes 1 second
 time slice of 1 second
–a job is preempted after running for 1 second
193
Round Robin
–time 0: job A arrives and starts
–time 0+: job B arrives
–time 1: job A is preempted; job B starts
–time 2: job B ends (response time = 2)
–time 101: job A ends (response time = 101)
 average response time = 51.5
194
Round Robin
 Does round-robin always achieve lower response time
than FCFS?
195
Round Robin
 Pros and cons
 + good for interactive computing
 - round robin has more overhead due to context
switches
196
Round Robin
 How to choose time slice?
 big time slice:
–degrades to FCFS
 small time slice:
–each context switch wastes some time
197
Round Robin
 typically a compromise
–10 milliseconds (ms)
 if context switch takes .1 ms
–then round robin with 10 ms slice wastes 1% of CPU
198
STCF
 Shortest Time to Completion First
 STCF: run whatever job has the least amount of work
to do before it finishes or blocks for an I/O
199
STCF
 STCF-P: preemptive version of STCF
 if a new job arrives that has less work than the
current job has remaining
–then preempt the current job in favor of the new one
200
STCF
 Idea is to finish the short jobs first
 Improves response time of shorter jobs by a lot
–Doesn’t hurt response time of longer jobs by too much
201
STCF
 STCF gives optimal response time among non-preemptive
policies
 STCF-P gives optimal response time among preemptive
policies (and non-preemptive policies)
202
STCF
 Is the following job a “short” or “long” job?
while (1) {
    use CPU for 1 ms
    use I/O for 10 ms
}
203
STCF
 Pros and cons
 + optimal average response time
 - unfair. Short jobs can prevent long jobs from ever
getting any CPU time (starvation)
 - needs knowledge of future
204
STCF
 STCF and STCF-P need knowledge of future
–it’s often very handy to know the future :-)
–how to find out the future time required by a job?
205
Example
job A: compute for 1000 seconds
job B: compute for 1000 seconds
job C: while (1) {
           use CPU for 1 ms
           use I/O for 10 ms
       }
206
Example
 C can use 91% of the disk by itself
 A or B can each use 100% of the CPU
 What happens when we run them together?
 Goal: keep both CPU and disk busy
207
FCFS
 If A or B runs before C,
 they prevent C from issuing its disk I/O for
–up to 2000 seconds
208
Round Robin
 with 100 ms time slice:
CA---------B---------CA---------B--------- ...
|--|
C's I/O
 Disk is idle most of the time that A & B are running
–about 10 ms of disk time every 200 ms
209
Round Robin
 with 1 ms time slice:
CABABABABABCABABABABABC ...
|--------|  |--------|
  C's I/O     C's I/O
 C runs more often, so it can issue its disk I/O
almost as soon as its last disk I/O is done
210
Round Robin with 1ms
 Disk is utilized almost 90% of the time
 Little effect on A or B’s performance
 General principle:
–first start the things that can run in parallel
 problem:
–lots of context switches (and context switch overhead)
211
STCF-P
 Runs C as soon as its disk I/O is done
–because it has the shortest next CPU burst
CA-------CA-------------CA--------- ...
|--------|  |--------|    |---------|
  C's I/O     C's I/O       C's I/O
212
Real-Time Scheduling
 So far, we’ve focused on average-case analysis
–average response time, throughput
 Sometimes, the right goal is to get each job done
before its deadline
–irrelevant how much before deadline job completes
213
Real-Time Scheduling
 Video or audio output.
–E.g. NTSC outputs 1 TV frame every 33 ms
 Control of physical systems
–e.g. auto assembly, nuclear power plants
214
Real-Time Scheduling
 This requires worst-case analysis
 How do we do this in real life?
215
EDF
 Earliest-deadline first
 Always run the job that has the earliest deadline
–i.e. the deadline coming up next
 If a new job arrives with an earlier deadline than
the currently running job
–preempt the running job and start the new one
216
EDF
 EDF is optimal
–it will meet all deadlines if it’s possible to do so
217
EDF
 Example
 job A: takes 15 seconds, deadline is 20 seconds after
entering system
 job B: takes 10 seconds, deadline is 30 seconds after
entering system
 job C: takes 5 seconds, deadline is 10 seconds after
entering system
218
EDF
Timeline (assuming all three jobs enter at time 0): C runs 0-5 (deadline 10), A runs 5-20 (deadline 20), B runs 20-30 (deadline 30); all deadlines are met.
219
Schedulability in Real-Time Systems
Not all sets of tasks are schedulable in RT systems
 Given
–m periodic events
–event i occurs within period Pi and requires Ci seconds
 Then the load can only be handled if

    Σ_{i=1..m} (Ci / Pi) ≤ 1
220
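A quick worked check with made-up numbers: three periodic tasks with (C1, P1) = (10 ms, 100 ms), (C2, P2) = (15 ms, 200 ms), and (C3, P3) = (5 ms, 50 ms) give a total utilization of 0.10 + 0.075 + 0.10 = 0.275 ≤ 1, so this task set is schedulable; adding a task with Ci/Pi > 0.725 would push the sum past 1 and make the set unschedulable.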
Scheduling in Batch Systems
 First come first serve
 Shortest job first
 Shortest remaining time next
 Three-level scheduling
221
Scheduling in Batch Systems
An example of shortest job first scheduling
222
Scheduling in Batch Systems
Three level scheduling
223
Scheduling in Interactive Systems
 Round-robin scheduling
 Priority scheduling
–Run highest priority process until it blocks or exits
–Alternatively, decrease the priority at each clock tick
 Multiple queues
–Divide priority into classes
–Dynamically adjust a process's priority class
224
Scheduling in Interactive Systems
 Shortest process next
–Estimate the running time of processes
 Guaranteed scheduling
–Each process runs for a 1/n fraction of the time
 Lottery scheduling
–Issue lottery tickets to processes & schedule
accordingly (see the sketch after this list)
 Fair share scheduling per user
225
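A minimal sketch of the lottery idea (the ticket counts are made up, and a real scheduler would draw only among runnable processes):

#include <stdlib.h>

#define NPROC 3
int tickets[NPROC] = {50, 30, 20};   /* more tickets = larger CPU share */

int lottery_pick(void)
{
    int total = 0;
    for (int i = 0; i < NPROC; i++)
        total += tickets[i];
    int winner = rand() % total;     /* draw a winning ticket */
    for (int i = 0; i < NPROC; i++) {
        if (winner < tickets[i])
            return i;                /* process i holds this ticket */
        winner -= tickets[i];
    }
    return NPROC - 1;                /* not reached */
}
/* Over many draws, process 0 wins ~50%, 1 ~30%, 2 ~20% of slices. */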
Scheduling in Interactive Systems
 Round Robin Scheduling
–(a) list of runnable processes
–(b) list of runnable processes after B uses up its quantum
226
Scheduling in Interactive Systems
A scheduling algorithm with four priority classes
227
Policy versus Mechanism
 Separate what is allowed to be done
–from how it is done
 Important in thread scheduling
–a process knows which of its child threads are
important and need priority
228
Policy versus Mechanism
 Scheduling algorithm parameterized
–mechanism in the kernel
 Parameters filled in by user processes
–policy set by user process
229
Thread Scheduling
Possible scheduling of user-level threads
230
Thread Scheduling
Possible scheduling of kernel-level threads
231
Thoughts Change Life (意念改变生活)