CS 519 -- Operating Systems
CS 519: Lecture 2
Processes, Threads, and Synchronization
Assignment 1
 Design and implement a threads package (details left out on
purpose!)
 Check the Web page and the newsgroup for further details
 Groups of 3 students. Anybody still looking for a group?
 Due 2/12/2001 at 5pm
CS 519
2
Operating System Theory
Introduction: Von Neumann Model
 Both text (program) and data reside in
memory
 Execution cycle
 Fetch instruction
 Decode instruction
 Execute instruction
[Figure: CPU connected to memory]
 OS is just a program
 OS text and data reside in memory too
 Invoke OS functionality through
procedure calls
Execution Mode
 Most processors support at least two modes of execution for
protection reasons
 Privileged - kernel-mode
 Non-privileged - user-mode
 The portion of the OS that executes in kernel-mode is called
the kernel
 Access hardware resources
 Protected from interference by user programs
 Code running in kernel-mode can do anything - no protection
 User code executes in user-mode
 OS functionality that does not need direct access to hardware
may run in user-mode
Interrupts and traps
 Interrupt: an asynchronous event
 External events which occur independently of the instruction
execution in the processor, e.g. DMA completion
 Can be masked (either specific interrupts or all of them)
 Trap: a synchronous event
 Conditionally or unconditionally caused by the execution of the
current instruction, e.g. floating point error, trap instruction
 Interrupt and trap events are predefined (typically as integers)
 Each interrupt and trap has an associated interrupt vector
 Interrupt vector specifies a handler that should be called when the
event occurs
 Interrupts and traps force the processor to save the current
state of execution and jump to the handler
Process
 An “instantiation” of a program
 System abstraction – the set of resources required for executing
a program
 Execution context(s)
 Address space
 File handles, communication endpoints, etc.
 Historically, all of the above “lumped” into a single abstraction
 More recently, split into several abstractions
 Threads, address space, protection domain, etc.
 OS process management:
 Supports creation of processes and interprocess communication (IPC)
 Allocates resources to processes according to specific policies
 Interleaves the execution of multiple processes to increase system
utilization
Process Image
 The physical representation of a process in the OS
 A process control structure includes
Identification info: process, parent process, user
Control info: scheduling (state, priority), resources (memory,
opened files), IPC
Execution contexts – threads
An address space consisting of code, data, and stack
segments
Process Address Space
[Figure: process address space — OS, code, globals, stack, and heap segments; the code segment holds procedures A, B, and C (A calls B, B calls C), and the stack holds their activation frames]
Virtual Memory
 Process addresses are virtual memory addresses
 Mapping from virtual memory to physical memory
Details next lecture
Translation: Translation Look-aside Buffer (TLB), page table
 Will cover more next time …
User Mode
 When running in user-mode, a process can only access
its virtual memory and processor resources
(registers) directly
 All other resources can only be accessed through the
kernel by “calling the system”
System call
A system call is a call because it looks like a procedure call
It’s a system call because, in reality, it’s a software trap
 Why is a system call a trap instead of a procedure
call? Because the processor needs to switch to kernel mode
Mode switching
 One or several bits in the processor status word (PSW) indicate
the current mode of execution
 Sometimes also the previous mode
 (Other bits in PSW: interrupt enabled/disabled flags, alignment
check bit, arithmetic overflow bit, parity bit, carry bit, etc.)
 Mode bits can be changed explicitly while in kernel-mode but
not in user-mode
 The processor status word can be loaded from the interrupt
vector along with the address of the interrupt handler
 Setting of kernel mode bit in the interrupt vector allows
interrupt and trap handlers to be “automatically” executed in
kernel mode
System Call In Monolithic OS
[Figure: a read(...) system call in a monolithic OS — the trap instruction indexes the interrupt vector, which loads the PC and PSW to enter the in-kernel file system code for the read system call in kernel mode; iret returns to user mode]
Signals
 OS may need to “upcall” into user processes
 Signals
UNIX mechanism to upcall when an event of interest occurs
Potentially interesting events are predefined: e.g.,
segmentation violation, message arrival, kill, etc.
When interested in “handling” a particular event (signal), a
process indicates its interest to the OS and gives the OS a
procedure that should be invoked in the upcall.
Signals (Cont’d)
 When an event of interest occurs
the kernel handles the event first,
then modifies the process’s stack to
look as if the process’s code made a
procedure call to the signal handler.
 When the user process is scheduled
next it executes the handler first
 From the handler the user process
returns to where it was when the
event occurred
[Figure: stack manipulation for signal delivery — the user stack with frames A and B before the event, and the same stack with the Handler frame pushed on top afterwards]
Process Creation
 How to create a process? System call.
 In UNIX, a process can create another process using the fork()
system call
 int pid = fork()
 The creating process is called the parent and the new process is
called the child
 The child process is created as a copy of the parent process
(process image and process control structure) except for the
identification and scheduling state
 Parent and child processes run in two different address spaces
 by default no memory sharing
 Process creation is expensive because of this copying
Example of Process Creation Using Fork
 The UNIX shell is a command-line interpreter whose basic
purpose is to let the user run applications on a UNIX system
 cmd arg1 arg2 ... argn
Inter-Process Communication
 Most operating systems provide several abstractions
for inter-process communication: message passing,
shared memory, etc
 Communication requires synchronization between
processes (i.e. data must be produced before it is
consumed)
 Synchronization can be implicit (message passing) or
explicit (shared memory)
 Explicit synchronization can be provided by the OS
(semaphores, monitors, etc) or can be achieved
exclusively in user-mode (if processes share memory)
Inter-Process Communication
 More on shared memory and message passing later
 Synchronization after we talk about threads
Threads
 Why limit ourselves to a single execution context?
 For example, have to use select() to deal with multiple outstanding
events. Having multiple execution contexts is more natural.
 Multiprocessor systems
 Multiple execution contexts → threads
 All the threads of a process share the same address space and
the same resources
 Each thread contains
 An execution state: running, ready, etc..
 An execution context: PC, SP, other registers
 A per-thread stack
Process Address Space Revisited
[Figure: (a) a single-threaded address space — OS, code, globals, one stack, heap; (b) a multi-threaded address space — the same segments, but one stack per thread]
Threads vs. Processes
 Why multiple threads?
 Can’t we use multiple processes to do whatever it is that we do with
multiple threads?
Sure, but we would need to be able to share memory (and other resources)
between multiple processes
This sharing is directly supported by threads – see later in the lecture
 Operations on threads (creation, termination, scheduling, etc) are
cheaper than the corresponding operations on processes
This is because thread operations do not involve manipulations of other
resources associated with processes
 Inter-thread communication is supported through shared memory
without kernel intervention
 Why not? We already have multiple instances of other
resources; why not multiple execution contexts?
Process state diagram
[Figure: process states — ready (in memory) and suspended (swapped out), connected by swap out / swap in transitions]
Thread State Diagram
[Figure: thread states — ready, running, blocked, and suspended (swapped out); thread scheduling moves threads between ready and running (dispatch, timeout) and from running to blocked (wait for event) and back to ready (event occurred); process scheduling suspends and activates threads (swapping)]
Thread Switching
 Typically referred to as context switching
 Context switching is the act of taking a thread off of
the processor and replacing it with another one that
is waiting to run
 A context switch takes place when
The time quantum allocated to the executing thread expires
The executing thread performs a blocking system call
The executing thread causes a memory fault (page miss)
Context Switching
 How to do a context switch?
Very carefully!!
 Save state of currently executing thread
Copy all “live” registers to thread control block
Need at least 1 scratch register – points to area of memory
in thread control block that registers should be saved to
 Restore state of thread to run next
Copy values of live registers from thread control block to
registers
 How to get into and out of the context switching
code?
Context Switching
 OS is just code stored in memory, so …
 Call context switching subroutine
 The subroutine saves context of current thread, restores context
of the next thread to be executed, and returns
 The subroutine returns in the context of another (the next) thread!
 Eventually, will switch back to the current thread
 To thread, it appears as if the context switching subroutine just
took a long while to return
Thread Switching (Cont’d)
 What if switching to a thread of a different
process? Context switch to process.
 Implications for caches, TLB, page table, etc.?
Caches
Physical addresses: no problem
Virtual addresses: cache must either have process tag or must
flush cache on context switch
TLB
Each entry must have process tag or flush TLB on context switch
Page table
Typically have page table pointer (register) that must be reloaded
on context switch
Process Swapping
 May want to swap out entire process
Thrashing if too many processes competing for resources
 To swap out a process
Suspend all of its threads
Must keep track of whether thread was blocked or ready
Copy all of its information to backing store (except for PCB)
 To swap a process back in
Copy needed information back into memory, e.g. page table,
thread control blocks
Restore each thread to blocked or ready
Must check whether event(s) has (have) already occurred
Threads & Signals
(Usual Unix Implementation)
 A signal is directed to a single thread (we don’t want a signal
delivered multiple times), but the handler is associated with the
process
 Signals can be ignored/masked by threads. Common strategy:
one thread is responsible for fielding all signals (calls sigwait() in
an infinite loop); all other threads mask all signals
 What happens if kernel wants to signal a process when all of its
threads are blocked? When there are multiple threads, which
thread should the kernel deliver the signal to?
 OS writes into PCB that a signal should be delivered to the process
 If one thread is responsible for all signals, it is scheduled to run
and fields the signal
 Otherwise, the next time any thread that does not mask the signal
is allowed to run, the signal is delivered to that thread as part of
the context switch
POSIX Threads (Pthreads) API Standard

 Thread creation and termination
pthread_create(&tid, NULL, start_fn, arg);
pthread_exit(status);
 Thread join
pthread_join(tid, &status);
 Mutual exclusion
pthread_mutex_lock(&lock);
pthread_mutex_unlock(&lock);
 Condition variable
pthread_cond_wait(&c, &lock);
pthread_cond_signal(&c);
Thread Implementation
 Kernel-level threads (lightweight processes)
Kernel sees multiple execution contexts
Thread management done by the kernel
 User-level threads
Implemented as a thread library which contains the code for
thread creation, termination, scheduling and switching
Kernel sees one execution context and is unaware of thread
activity
Can be preemptive or not
User-Level vs. Kernel-Level Threads
 Advantages of user-level threads
Performance: low-cost thread operations and context
switching, since it is not necessary to go into the kernel
Flexibility: scheduling can be application-specific
Portability: user-level thread library easy to port
 Disadvantages of user-level threads
If a user-level thread blocks in the kernel, the entire
process (all threads of that process) is blocked
Cannot take advantage of multiprocessing (the kernel assigns
one process to only one processor)
User-Level vs. Kernel-Level Threads
[Figure: user-level threads — the thread library does thread scheduling within a process in user space, while the kernel does process scheduling onto the processor; kernel-level threads — the kernel does both thread and process scheduling onto the processor]
User-Level vs. Kernel-Level Threads
 No reason why we shouldn’t
have both (e.g. Solaris)
 Most systems now support
kernel threads
 User-level threads are
available as linkable libraries
[Figure: combined model (e.g. Solaris) — user-level thread scheduling on top of kernel-level threads, which the kernel's thread and process scheduling maps onto processors]
Kernel Support for User-Level Threads
 Even kernel threads are not quite the right
abstraction for supporting user-level threads
 Mismatch between where the scheduling information
is available (user) and where scheduling on real
processors is performed (kernel)
When the kernel thread is blocked, the corresponding
physical processor is lost to all user-level threads although
there may be some ready to run.
Why Kernel Threads Are Not The Right
Abstraction
[Figure: user-level threads multiplexed by a user-level scheduler onto a kernel thread; when that kernel thread blocks, kernel-level scheduling takes the physical processor away from all of the process's user-level threads]
Scheduler Activations
 Kernel allocates processors to processes as scheduler
activations
 Each process contains a user-level thread system
which controls the scheduling of the allocated
processors
 Kernel notifies a process whenever the number of
allocated processors changes or when a scheduler
activation is blocked due to the user-level thread
running on it
 The process notifies the kernel when it needs more
or fewer scheduler activations (processors)
User-Level Threads On Top of
Scheduler Activations
[Figure: user-level scheduling runs threads on scheduler activations (one blocked, one active); kernel-level scheduling maps the activations onto physical processors]
Thread/Process Operation Latencies
Operation     User-level threads (μs)   Kernel threads (μs)   Processes (μs)
Null fork     34                        948                   11,300
Signal-wait   37                        441                   1,840

VAX uniprocessor running UNIX-like OS, 1992.
Synchronization
 Why synchronization?
 Problem
Threads share data
Data integrity must be maintained
 Example
Transfer $10 from account A to account B
A ← A − 10
B ← B + 10
If a thread is taking a snapshot of account balances, we don’t
want it to read A and B between the previous two statements
Terminology
 Critical section: a section of code which reads or
writes shared data
 Race condition: potential for interleaved execution of
a critical section by multiple threads
Results are non-deterministic
 Mutual exclusion: synchronization mechanism to avoid
race conditions by ensuring exclusive execution of
critical sections
 Deadlock: permanent blocking of threads
 Livelock: execution but no progress
 Starvation: one or more threads denied resources
Requirements for Mutual Exclusion
 No assumptions on hardware: speed, # of processors
 Execution of CS takes a finite time
 A thread/process not in its CS cannot prevent other
threads/processes from entering the CS
 Entry to the CS cannot be delayed indefinitely: no
deadlock or starvation
Synchronization Primitives
 Most common primitives
Mutual exclusion
Condition variables
Semaphores
Monitors
 Strictly, we only need one of:
Semaphores
Mutual exclusion and condition variables
(each can be implemented in terms of the other)
Mutual Exclusion
 Mutual exclusion is used when we want to be the
only thread modifying a set of data items
 Has three components:
 Lock, Unlock, Waiting
 Example: transferring $10 from A to B

Fine-grained locking:
Lock(A)
Lock(B)
A ← A − 10
B ← B + 10
Unlock(B)
Unlock(A)

Coarse-grained locking:
Function Transfer (Amount, A, B)
Lock(Transfer_Lock)
A ← A − 10
B ← B + 10
Unlock(Transfer_Lock)
Implementation of Mutual Exclusion
 Many read/write algorithms for mutual exclusion
See any OS book
Disadvantages: difficult to get correct and not very general
 Simple with a little help from the hardware
 Atomic read-modify-write instructions
Test-and-set
Fetch-and-increment
Compare-and-swap
Atomic Read-Modify-Write Instructions
 Test-and-set
Read a memory location and, if the value is 0, set it to 1 and
return true. Otherwise, return false
 Fetch-and-increment
Return the current value of a memory location and increment
the value in memory by 1
 Compare-and-swap
Compare the value of a memory location with an old value,
and if the same, replace with a new value
Simple Spin Lock
 Test-and-set
Spin_lock(lock)
{
    while (test-and-set(lock) == 0)
        ;   /* spin until we atomically change lock from 0 to 1 */
}
Spin_unlock(lock)
{
    lock = 0;
}
Load-Locked, Store-Conditional
 Load-Linked, Store-Conditional (LLSC)
Pair of instructions may be easier to implement in hardware
Load-linked (or load-locked) returns the value of a memory
location
Store-conditional stores a new value to the same memory
location if the value of that location has not been changed
since the LL. Returns 0 or 1 to indicate success or failure
If a thread is switched out between an LL and an SC, then
the SC automatically fails
What To Do While Waiting?
 Spinning
Waiting threads keep testing the location until it changes value
Doesn’t work on a uniprocessor system
 Blocking
OS or run-time system deschedules waiting threads
 Spinning vs. blocking becomes an issue in
multiprocessor systems (with > 1 thread/processor)
 What is the main trade-off? Overhead of blocking vs.
expected waiting time
Deadlock
[Figure (animation over three slides): two threads and two locks — one thread holds Lock A and requests Lock B while the other holds Lock B and requests Lock A; neither request can be granted, so both block forever]
Deadlock (Cont’d)
 Deadlock can occur whenever multiple parties are competing for
exclusive access to multiple resources
 How can we deal with deadlocks?
 Deadlock prevention
Design a system without one of mutual exclusion, hold and wait, no
preemption or circular wait (four necessary conditions)
To prevent circular wait, impose a strict ordering on resources. For instance,
if need to lock variables A and B, always lock A first, then lock B
 Deadlock avoidance
Deny requests that may lead to unsafe states (Banker’s algorithm)
Running the algorithm on all resource requests is expensive
 Deadlock detection and recovery
Check for circular wait periodically. If circular wait is found, abort all
deadlocked processes (extreme solution but very common)
Checking for circular wait is expensive
Condition Variables
 A condition variable is always associated with a
condition and a lock
 Typically used to wait for a condition to take on a
given value
 Three operations:
cond_wait(lock, cond_var)
cond_signal(cond_var)
cond_broadcast(cond_var)
Condition Variables
 cond_wait(lock, cond_var)
Release the lock
Sleep on cond_var
When awakened by the system, reacquire the lock and return
 cond_signal(cond_var)
If at least 1 thread is sleeping on cond_var, wake 1 up
Otherwise, no effect
 cond_broadcast(cond_var)
If at least 1 thread is sleeping on cond_var, wake everyone up
Otherwise, no effect
Producer/Consumer Example

Producer:
lock(lock_bp)
while (free_bp.is_empty())
    cond_wait(lock_bp, cond_freebp_empty)
buffer ← free_bp.get_buffer()
unlock(lock_bp)
… produce data in buffer …
lock(lock_bp)
data_bp.add_buffer(buffer)
cond_signal(cond_databp_empty)
unlock(lock_bp)

Consumer:
lock(lock_bp)
while (data_bp.is_empty())
    cond_wait(lock_bp, cond_databp_empty)
buffer ← data_bp.get_buffer()
unlock(lock_bp)
… consume data in buffer …
lock(lock_bp)
free_bp.add_buffer(buffer)
cond_signal(cond_freebp_empty)
unlock(lock_bp)
Condition Variables
 Condition variables are implemented using locks
Implementation is tricky because it involves multiple
locks and the scheduling queues
 Implemented in the OS or run-time thread systems
because they involve scheduling operations
Sleep/Wakeup
Semaphores
 Synchronized counting variables
 A semaphore is comprised of:
An integer value
Two operations: P() and V()
 P()
Decrement value
If the result is negative, sleep until woken
 V()
Increment value
If any threads are sleeping on the semaphore, wake
up one of them
Monitors
 Semaphores have a few limitations: unstructured,
difficult to program correctly. Monitors eliminate
these limitations and are as powerful as semaphores
 A monitor consists of a software module with one or
more procedures, an initialization sequence, and local
data (can only be accessed by procedures)
 Only one process can execute within the monitor at
any one time (mutual exclusion)
 Synchronization within the monitor implemented with
condition variables (wait/signal) => one queue per
condition variable
Lock and Wait-free Synchronization (or
Mutual Exclusion Considered Harmful)
 Problem: Critical sections
If a process is delayed or halted in a critical section, it is hard
or impossible for other processes to make progress
 Lock-free (aka non-blocking) concurrent objects
Some process will complete an operation on the object in a
finite number of steps
 Wait-free concurrent objects
Each process attempting to perform an operation on the
concurrent object will complete it in a finite number of steps
 Essential difference between these two? The latter
definition guarantees lack of starvation
Herlihy’s Paper
 What’s the point?
Lock and wait-free synchronization can be very useful for
building multiprocessor systems; they provide progress
guarantees
Building lock and wait-free concurrent objects is HARD, so a
methodology for implementing these objects is required
Lock and wait-free synchronization provide guarantees but
can be rather expensive – lots of copying
 M. Greenwald and D. Cheriton. The Synergy between
Non-blocking Synchronization and Operating System
Structure. In Proceedings of OSDI ’96, Oct, 1996.
Single Word Protocol
Fetch_and_add_sync(obj: integer, v: integer) returns(integer)
    success: boolean := false
    loop exit when success
        old: integer := obj
        new: integer := old + v
        success := compare_and_swap(obj, old, new)
    end loop
    return old
end Fetch_and_add_sync
Lock-free but not wait-free!!
Small Objects
 Small object: occupies 1 block (or less)
 Basic idea of lock-free object implementation
1. Allocate a new block
2. Copy the object to the new block, making sure the copy is consistent
3. Modify the copied block, i.e. perform the operation on it
4. Try to atomically replace the pointer to the old block with one to the new block
 Basic idea of wait-free object implementation
 Same as before, except that processes should announce
their operations in a shared data structure
 Any process in step 3 should perform all previously
announced operations, not just its own operations
Single Word Protocol
Fetch_and_add_sync(obj: integer, v: integer) returns(integer)
    success: boolean := false
    loop exit when success
        old: integer := obj        // allocate, copy and modify
        new: integer := old + v    // single word, so trivially consistent
        success := compare_and_swap(obj, old, new)  // replace
    end loop
    return old
end Fetch_and_add_sync
Lock-free but not wait-free!!
Large Objects
 I’ll leave this one for you to read
 Basic idea is to make local changes in the data
structure using the small object protocol
 Complicated, but no different from locking: e.g., if
you lock an entire tree to modify a few nodes,
performance will go down the drain