Unit 3: Concurrency
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity 求于至简，归于永恒
Outline of Content
3.1. Critical Sections, Semaphores, and Monitors
3.2. Windows Trap Dispatching, Interrupts, Synchronization
3.3. Advanced Windows Synchronization
3.4. Windows APIs for Synchronization and IPC
Critical Sections, Semaphores, and Monitors
The Critical-Section Problem
Software Solutions
Synchronization Hardware
Semaphores
Synchronization in Windows & Linux
The Critical-Section Problem
n threads all competing to use a shared resource
Each thread has a code segment, called the critical section (CS), in which the shared data is accessed
Problem: ensure that when one thread is executing in its critical section, no other thread is allowed to execute in its critical section
Solution to Critical-Section Problem
Mutual Exclusion
– Only one thread at a time is allowed into its CS, among all threads that have a CS for the same resource or shared data
– A thread halted in its non-critical section must not interfere with other threads
Progress
– A thread remains inside its CS for a finite time only
– No assumptions may be made concerning the relative speed of the threads
Solution to Critical-Section Problem
Bounded Waiting
– It must not be possible for a thread requiring access to a critical section to be delayed indefinitely
– When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay
Initial Attempts to Solve Problem
Only 2 threads, T_0 and T_1
General structure of thread T_i (the other thread is T_j):

do {
    entry section
    critical section
    exit section
    remainder section
} while (1);

Threads may share some common variables to synchronize their actions
First Attempt: Algorithm 1
Shared variables
– Initialization: int turn = 0;
– turn == i  ⇒  T_i can enter its critical section
Thread T_i:

do {
    while (turn != i) ;
    critical section
    turn = j;
    remainder section
} while (1);
Satisfies mutual exclusion, but not progress
Second Attempt: Algorithm 2
Shared variables
– Initialization: int flag[2]; flag[0] = flag[1] = 0;
– flag[i] == 1  ⇒  T_i can enter its critical section
Thread T_i:

do {
    flag[i] = 1;
    while (flag[j] == 1) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);

Satisfies mutual exclusion, but not the progress requirement
Algorithm 3 (Peterson's Algorithm, 1981)
Shared variables of algorithms 1 and 2; initialization: int flag[2]; flag[0] = flag[1] = 0; int turn = 0;
Thread T_i:

do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && (turn == j)) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);

Solves the critical-section problem for two threads
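A minimal runnable sketch of Peterson's algorithm is given below, assuming a POSIX platform with C11 atomics; the atomics stand in for the plain shared variables above, since on modern compilers and CPUs plain loads and stores may be reordered in ways the algorithm cannot tolerate. The worker function and iteration count are illustrative, not part of the original slide.

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static atomic_int flag[2];   /* flag[i] == 1: T_i wants to enter its CS */
static atomic_int turn;      /* index of the thread that must defer */
static int counter;          /* shared data protected by the algorithm */

static void *worker(void *arg) {
    int i = (int)(intptr_t)arg, j = 1 - i;
    for (int k = 0; k < 100000; k++) {
        atomic_store(&flag[i], 1);                 /* entry section */
        atomic_store(&turn, j);
        while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
            ;                                      /* busy-wait */
        counter++;                                 /* critical section */
        atomic_store(&flag[i], 0);                 /* exit section */
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)(intptr_t)0);
    pthread_create(&t1, NULL, worker, (void *)(intptr_t)1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %d\n", counter);             /* expect 200000 */
    return 0;
}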
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical-section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
Dekker's Algorithm (contd.)
Shared variables; initialization: int flag[2]; flag[0] = flag[1] = 0; int turn = 0;
Thread T_i:

do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
Bakery Algorithm (Lamport, 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
The holder of the smallest number enters the CS
If threads T_i and T_j receive the same number: if i < j, then T_i is served first; else T_j is served first
The numbering scheme generates numbers in monotonically non-decreasing order, e.g., 1,1,1,2,3,3,3,4,4,5...
Bakery Algorithm
Notation: “<” establishes lexicographical order among 2-tuples (ticket #, thread id #):
    (a,b) < (c,d)  if a < c, or if a == c and b < d
    max(a_0, …, a_{n-1}) = { k | k >= a_i for i = 0, …, n-1 }
Shared data:
    int choosing[n];
    int number[n];   // the ticket
Data structures are initialized to 0
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) &&
               ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
Mutual Exclusion - Hardware Support
Interrupt disabling
– Concurrent threads cannot overlap on a uniprocessor
– A thread will run until it performs a system call or an interrupt happens
Special atomic machine instructions
– Test-and-Set instruction – read and write a memory location atomically
– Exchange instruction – swap register and memory location
Problems with the machine-instruction approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically:

boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread T_i:

do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
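As a concrete illustration, here is a minimal sketch of the same spinlock idea in standard C11, assuming a C11-capable compiler; atomic_flag_test_and_set plays the role of the TestAndSet instruction described above.

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

void acquire(void) {
    while (atomic_flag_test_and_set(&lock))
        ;                        /* spin until the flag was previously clear */
}

void release(void) {
    atomic_flag_clear(&lock);    /* equivalent to lock = false */
}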
Synchronization Hardware
Atomically swap two variables:

void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread T_i:

int key;
do {
    key = 1;
    while (key == 1)
        Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
Semaphores
Semaphore S – integer variable
Can only be accessed via two atomic operations:
    wait(S):   while (S <= 0) ;  S--;
    signal(S): S++;
Critical Section of n Threads
Shared data: semaphore mutex; //initially mutex = 1
Thread T_i:

do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
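For comparison, a minimal sketch of the same mutual-exclusion pattern with POSIX semaphores, where sem_wait/sem_post correspond to wait/signal above; the worker function is an illustrative assumption.

#include <pthread.h>
#include <semaphore.h>

sem_t mutex;            /* used as a lock: initial value 1 */
int shared_data;

void *worker(void *arg) {
    sem_wait(&mutex);   /* wait(mutex) */
    shared_data++;      /* critical section */
    sem_post(&mutex);   /* signal(mutex) */
    return NULL;
}

int main(void) {
    sem_init(&mutex, 0, 1);   /* 0: shared between threads, not processes */
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    sem_destroy(&mutex);
    return 0;
}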
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record:

typedef struct {
    int value;
    struct thread *L;   /* list of threads waiting on the semaphore */
} semaphore;

Assume two simple operations:
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations are now defined as:

wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }

signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
Semaphore as a General Synchronization Tool
Execute B in T_j only after A is executed in T_i
Use a semaphore flag initialized to 0
Code:
    T_i:            T_j:
      A               wait(flag)
      signal(flag)    B
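A minimal sketch of this ordering pattern with POSIX semaphores; the printf calls stand in for statements A and B, and the function names are illustrative.

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t flag;                      /* initialized to 0 in main */

void *ti(void *arg) {            /* thread T_i */
    printf("A\n");               /* statement A */
    sem_post(&flag);             /* signal(flag) */
    return NULL;
}

void *tj(void *arg) {            /* thread T_j */
    sem_wait(&flag);             /* wait(flag): blocks until A is done */
    printf("B\n");               /* statement B */
    return NULL;
}

int main(void) {
    sem_init(&flag, 0, 0);
    pthread_t a, b;
    pthread_create(&b, NULL, tj, NULL);
    pthread_create(&a, NULL, ti, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&flag);
    return 0;
}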
Two Types of Semaphores
Counting semaphore
– The integer value can range over an unrestricted domain
Binary semaphore
– The integer value can range only between 0 and 1
– Can be simpler to implement
A counting semaphore S can be implemented using a binary semaphore
Deadlock and Starvation
Deadlock
– Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1:

    T_0:           T_1:
      wait(S);       wait(Q);
      wait(Q);       wait(S);
      ...            ...
      signal(S);     signal(Q);
      signal(Q);     signal(S);
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– All code should acquire/release semaphores in the same order
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems.
Uses spinlocks on multiprocessor systems.
Provides dispatcher objects which may act as mutexes and semaphores.
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems.
Uses spinlocks for multiprocessor synchronization.
Uses semaphores and readers-writers locks when longer sections of code need access to data.
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2. Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– The system is not protected from the system
Code regions are tagged “no write in any mode”
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters:
– “Privileged Time” and “User Time”
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the “System” process)
– Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– The interrupt dispatcher invokes the interrupt service routine (ISR)
– The ISR runs in the context of the interrupted thread, the so-called “arbitrary thread context”
– The ISR often requests the execution of a “DPC routine,” which also runs in kernel mode
– Time is not charged to the interrupted thread
Trap dispatching
Trap: the processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions – synchronous
[Diagram: interrupts → interrupt dispatcher → interrupt service routines; system service calls → system service dispatcher → system services; HW/SW exceptions → exception dispatcher → exception handlers; virtual address exceptions → virtual memory manager's pager]
Interrupt Dispatching
An interrupt can arrive while user- or kernel-mode code is running; dispatching proceeds in kernel mode.
Interrupt dispatch routine:
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call the appropriate ISR
– Dismiss the interrupt
– Restore machine state (including mode and enabled interrupts)
Note: no thread or process context switch!
Interrupt service routine:
– Tell the device to stop interrupting
– Interrogate device state, start the next operation on the device, etc.
– Request a DPC
– Return to caller
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– The precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as an IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises the processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
Levels 31..0, from highest to lowest precedence:
    31  High
    30  Power fail
    29  Interprocessor Interrupt
    28  Clock
    27  Profile & Synch (Srv 2003)
    26..3  Device interrupts (hardware interrupts)
    2   Dispatch/DPC (deferrable software interrupt)
    1   APC (deferrable software interrupt)
    0   Passive/Low (normal thread execution)
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86:
– Interrupt controller interrupts processor (single line)
– Processor queries for the interrupt vector; uses the vector as an index into the IDT
Alpha:
– PAL code (Privileged Architecture Library – Alpha BIOS) determines the interrupt vector, calls the kernel
– Kernel uses the vector to index the IDT
After ISR execution, IRQL is lowered to its initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls the ISR with the interrupt object as parameter (HW cannot pass parameters to the ISR)
Connecting/disconnecting interrupt objects:
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn the ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High – used when halting the system (via KeBugCheck())
Power fail – originated in the NT design document, but has never been used
Interprocessor interrupt – used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock – used to update the system clock and for allocation of CPU time to threads
Profile – used for kernel profiling (see Kernel profiler – Kernprof.exe, Resource Kit)
Predefined IRQLs (contd.)
Device – used to prioritize device interrupts
DPC/dispatch and APC – software interrupts that the kernel and device drivers generate
Passive – no interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
Levels run from 15 (highest) down to 0 (lowest):
– x64: High/Profile, Interprocessor Interrupt/Power, Clock, Synch (Srv 2003), Device n … Device 1, Dispatch/DPC, APC, Passive/Low
– IA64: High/Profile/Power, Interprocessor Interrupt, Clock, Synch (MP only), Device n … Device 1, Correctable Machine Check, Dispatch/DPC & Synch (UP only), APC, Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (effectively random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round-robin for each interrupt vector
Software interrupts
Initiating thread dispatching
– DPCs allow scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
[Diagram: Processors A and B both contend for the spinlock protecting the DPC queue; each executes:

    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin
        remove DPC from queue
    end
    release_spinlock(DPC)

The removal of a DPC from the queue is the critical section.]
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the first processor's flag is modified
– Exactly one processor is signaled
– Pre-determined wait order
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003:
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of the use of spinlocks and of the length of time they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003:
– Minimized lock contention for hot locks (e.g., the PFN, or Page Frame Number, database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– A wait for multiple objects can wait for “any” one or “all” of them at once (“all”: all objects must be in the signaled state concurrently to resolve the wait)
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include:
– Events (may be auto-reset or manual-reset; may be set or “pulsed”)
– Mutexes (“mutual exclusion”, one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or termination)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g., for a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Diagram: interaction with thread scheduling — create and initialize thread object (Initialized); thread waits on an object handle (Waiting); wait is complete: set object to signaled state; the thread then moves through Ready, Standby, Running, and eventually Terminated, possibly via Transition]
Interaction between Synchronization & Dispatching
A user-mode thread waits on an event object's handle
The kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
The kernel wakes up the waiting threads; variable-priority threads get a priority boost
The dispatcher re-schedules the new thread – it may preempt the running thread if it has lower priority, issuing a software interrupt to initiate a context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue, to be scheduled later
What signals an object?
Dispatcher object | System events and resulting state change | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases the mutex (nonsignaled → signaled) | Kernel resumes one waiting thread; the resumed thread acquires the mutex
Mutex (exported to user mode) | Owning thread or other thread releases the mutex (nonsignaled → signaled) | Kernel resumes one waiting thread; the resumed thread acquires the mutex
Semaphore | One thread releases the semaphore, freeing a resource (nonsignaled → signaled) | Kernel resumes one or more waiting threads; a thread acquires the semaphore (more resources are not available)
What signals an object? (contd.)
Dispatcher object | System events and resulting state change | Effect of signaled state on waiting threads
Event | A thread sets the event (nonsignaled → signaled) | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair (nonsignaled → signaled) | Kernel resumes the other dedicated (waiting) thread
Timer | Timer expires (nonsignaled → signaled); a thread (re)initializes the timer (signaled → nonsignaled) | Kernel resumes all waiting threads
Thread | Thread terminates (nonsignaled → signaled); a thread reinitializes the thread object (signaled → nonsignaled) | Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3. Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually the ISR) queues a request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted at other CPUs
– Executes the specified procedure at dispatch IRQL (also called “dispatch level” or “DPC level”) when all higher-IRQL work (interrupts) has completed
– Maximum recommended times: ISR: 10 usec, DPC: 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Diagram: DPC queue — a queue head pointing to a linked list of DPC objects]
Delivering a DPC
DPC routines can't assume which process address space is currently mapped
1. A timer expires; the kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. The DPC interrupt occurs when the IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. The dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
The executive uses APCs to complete work in thread space
– Wait for an asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– e.g., WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from “arbitrary thread context”
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
“Ordinary” kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an “alertable wait”
Asynchronous Procedure Calls (APCs)
[Diagram: a thread object with kernel-mode (K) and user-mode (U) APC queues, each holding APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting:
– If IRQL < 2: charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC: charge to DPC time
– If IRQL = 2 and not processing a DPC: charge to the thread's kernel time
– If IRQL > 2: charge to interrupt time
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the cause must be interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time in the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– “Context switches” for these are really the number of interrupts and DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context-switch activity with Process Explorer
– Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a “dispatcher object”
– Some are exclusively for synchronization: e.g., events, mutexes (“mutants”), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function: e.g., processes, threads, file objects
– Non-waitable kernel objects are called “control objects”
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– “signaled” vs. “nonsignaled”
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects (see \ntddk\inc\ddk\ntddk.h)
[Diagram: dispatcher object layout — Size, State, Type, Wait listhead, followed by object-type-specific data]
[Diagram: two dispatcher objects (Size, State, Type, Wait listhead, object-type-specific data) whose wait listheads link wait blocks (List entry, Thread Object, Key, Type, Next link); each waiting thread chains its wait blocks through its WaitBlockList]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it is waiting for (one per handle passed to WaitFor…)
– All wait blocks from a given wait call are chained to the waiting thread
– Type indicates a wait for “any” or “all”
– Key denotes the argument-list position for WaitForMultipleObjects
3.4. Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;

InitializeCriticalSection( &crit );

/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection( &crit );   /* released exactly once */
    }
}

DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
The following kernel objects can be used to synchronize threads:
– Processes
– Threads
– Files
– File change notifications
– Console input
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
Wait Functions - Details
WaitForSingleObject():
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
  – dwTimeout == 0: no wait, just check whether the object is signaled
  – dwTimeout == INFINITE: wait forever
WaitForMultipleObjects():
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying the objects
– bWaitAll: whether to wait for the first signaled object or for all objects; the function returns the index of the first signaled object
Side effects:
– Mutexes, auto-reset events, and waitable timers will be reset to the nonsignaled state after completing wait functions
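A minimal sketch of WaitForMultipleObjects in use, waiting for two worker threads to exit (thread handles are signaled on termination); the Worker routine and sleep times are illustrative assumptions.

#include <windows.h>
#include <stdio.h>

DWORD WINAPI Worker(LPVOID arg) {
    Sleep(100 * (DWORD)(ULONG_PTR)arg);   /* simulate some work */
    return 0;
}

int main(void) {
    HANDLE h[2];
    h[0] = CreateThread(NULL, 0, Worker, (LPVOID)(ULONG_PTR)1, 0, NULL);
    h[1] = CreateThread(NULL, 0, Worker, (LPVOID)(ULONG_PTR)2, 0, NULL);

    /* bWaitAll == TRUE: block until both thread objects are signaled */
    DWORD ret = WaitForMultipleObjects(2, h, TRUE, INFINITE);
    if (ret == WAIT_OBJECT_0)
        printf("both workers finished\n");

    CloseHandle(h[0]);
    CloseHandle(h[1]);
    return 0;
}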
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );

/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
The POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions:
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison:
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
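A minimal sketch of the equivalent POSIX calls; the worker function is an illustrative assumption.

#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int counter;

void *worker(void *arg) {
    pthread_mutex_lock(&mutex);        /* blocks, like WaitForSingleObject */
    counter++;                         /* critical section */
    pthread_mutex_unlock(&mutex);      /* like ReleaseMutex */

    if (pthread_mutex_trylock(&mutex) == 0) {  /* nonblocking, like timeout 0 */
        counter++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    return 0;
}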
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each completed wait decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() returns the old semaphore count (via lpPreviousCount)
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
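A minimal sketch of resource counting with a Windows semaphore, limiting concurrent access to three identical resources; the worker routine is an illustrative assumption.

#include <windows.h>

HANDLE hSem;   /* guards a pool of 3 identical resources */

DWORD WINAPI Worker(LPVOID arg) {
    WaitForSingleObject(hSem, INFINITE);  /* count--, blocks while count == 0 */
    /* ... use one of the 3 resources ... */
    ReleaseSemaphore(hSem, 1, NULL);      /* count++ */
    return 0;
}

int main(void) {
    hSem = CreateSemaphore(NULL, 3, 3, NULL);  /* initial count 3, max 3 */
    HANDLE h = CreateThread(NULL, 0, Worker, NULL, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    CloseHandle(hSem);
    return 0;
}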
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE creates the event in the signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
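A minimal sketch of barrier-style release with a manual-reset event: all workers block on the event, and one SetEvent call releases them together. The thread count and names are illustrative assumptions.

#include <windows.h>

HANDLE hGo;   /* manual-reset event used as a "go" signal */

DWORD WINAPI Worker(LPVOID arg) {
    WaitForSingleObject(hGo, INFINITE);   /* all workers block here */
    /* ... proceed once the event is set ... */
    return 0;
}

int main(void) {
    hGo = CreateEvent(NULL, TRUE, FALSE, NULL);  /* manual reset, nonsignaled */
    HANDLE h[4];
    for (int i = 0; i < 4; i++)
        h[i] = CreateThread(NULL, 0, Worker, NULL, 0, NULL);

    SetEvent(hGo);                               /* releases all four workers */
    WaitForMultipleObjects(4, h, TRUE, INFINITE);
    for (int i = 0; i < 4; i++)
        CloseHandle(h[i]);
    CloseHandle(hGo);
    return 0;
}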
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions:
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling:
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
A read on a pipe handle will block if the pipe is empty
A write operation to a full pipe will block
Anonymous pipes are one-way

BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );

[Diagram: main program connects prog1 → pipe → prog2]
I/O Redirection using an Anonymous Pipe
/* Create default-size anonymous pipe; handles are inheritable. */
if (!CreatePipe(&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n");
    exit(1);
}
/* Set output handle to pipe handle, create the first process. */
StartInfoCh1.hStdInput  = GetStdHandle(STD_INPUT_HANDLE);
StartInfoCh1.hStdError  = GetStdHandle(STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags    = STARTF_USESTDHANDLES;
if (!CreateProcess(NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL,
                   &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n");
    exit(2);
}
CloseHandle(hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput  = hReadPipe;
StartInfoCh2.hStdError  = GetStdHandle(STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle(STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags    = STARTF_USESTDHANDLES;
if (!CreateProcess(NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
                   0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n");
    exit(3);
}
CloseHandle(hReadPipe);

/* Wait for both processes to complete. */
WaitForSingleObject(ProcInfo1.hProcess, INFINITE);
WaitForSingleObject(ProcInfo2.hProcess, INFINITE);
CloseHandle(ProcInfo1.hThread);
CloseHandle(ProcInfo1.hProcess);
CloseHandle(ProcInfo2.hThread);
CloseHandle(ProcInfo2.hProcess);
return 0;
Named Pipes
Message-oriented:
– The reading process can read varying-length messages precisely as sent by the writing process
Bi-directional:
– Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe:
– Several clients can communicate with a single server using the same pipe name
– The server can respond to a client using the same instance
Pipes can be accessed over the network:
– Location transparency
Convenience and connection functions
Using Named Pipes
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (“.” – local machine)
fdwOpenMode:
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode:
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
– These mode settings apply to all instances of a named pipe

HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Named Pipes (contd.)
nMaxInstances:
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS chooses based on resources
dwTimeOut:
– Default time-out period (in msec) for WaitNamedPipe()
The first CreateNamedPipe call creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe:
– Nondestructive – is there a message waiting for ReadFile?

BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with the named pipe name:
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– The first method gives better performance (local server)
Status functions:
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
WriteFile / ReadFile sequence in one call:

BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpo );
Convenience Functions
CreateFile / WriteFile / ReadFile / CloseHandle in one call:
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Server: eliminate the polling loop with ConnectNamedPipe()
lpo == NULL:
– The call will return as soon as there is a client connection
– Returns FALSE if a client connected between the CreateNamedPipe call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe():
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes:
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it is easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
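For comparison, a minimal sketch of the UNIX side, creating and writing a FIFO; the path /tmp/myfifo is an illustrative assumption.

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    mkfifo("/tmp/myfifo", 0666);            /* create the FIFO node */

    int fd = open("/tmp/myfifo", O_WRONLY); /* blocks until a reader opens */
    write(fd, "hello\n", 6);                /* atomic for small messages */
    close(fd);

    unlink("/tmp/myfifo");                  /* remove the FIFO node */
    return 0;
}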
Client Example using Named Pipe
WaitNamedPipe(ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile(ServerPipeName, GENERIC_READ | GENERIC_WRITE,
                        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n");
    exit(3);
}
/* Write the request. */
WriteFile(hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to stdout. */
while (ReadFile(hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf("%s", Response.Record);
CloseHandle(hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe(SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf("Server is awaiting next request.\n");
    if (!ConnectNamedPipe(hNamedPipe, NULL)
            || !ReadFile(hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n");
        exit(4);
    }
    printf("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client. */
    fp = fopen(File, "r");
    while (fgets(Response.Record, MAX_RQRS_LEN, fp) != NULL)
        WriteFile(hNamedPipe, &Response.Record,
                  (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose(fp);
    DisconnectNamedPipe(hNamedPipe);
} /* End of server operation. */
Win32 IPC - Mailslots
Broadcast mechanism:
– One-directional
– Multiple writers/multiple readers (frequently: one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Mailslots bear some nasty implementation details; they are almost never used
Operations on the mailslot:
– Each reader (server) creates a mailslot with CreateMailslot()
– A write-only client opens the mailslot with CreateFile() and uses WriteFile() – the open will fail if there are no waiting readers
– The client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– The client will connect to every server in the network domain
Locate a server via a mailslot
Mailslot servers – each app client creates a mailslot and reads the server's status:

    /* App clients 0..n */
    hMS = CreateMailslot("\\.\mailslot\status");
    ReadFile(hMS, &ServStat);
    /* connect to server */

Mailslot client – the app server sends its status message periodically:

    while (...) {
        Sleep(...);
        hMS = CreateFile("\\.\mailslot\status");
        ...
        WriteFile(hMS, &StatInfo, ...);
    }
Creating a mailslot
lpszName points to a name of the form \\.\mailslot\[path]name
– The name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout:
– A read operation will wait that many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait

HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
Opening a mailslot
CreateFile with the following names:
– \\.\mailslot\[path]name – retrieves a handle for a local mailslot
– \\host\mailslot\[path]name – retrieves a handle for a mailslot on the specified host
– \\domain\mailslot\[path]name – returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name – returns a handle representing mailslots on machines in the system's primary domain (max message length: 400 bytes)
– The client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活