A Practical guide to writing multithreaded code


Guide to multithreaded code
“All I need to know about writing multithreaded code is that I am not
smart enough to do it right.” - Terry Way
Intended audience:
• C/C++ programmers not already expert in threading.
• Visual Basic and Java programmers - it’s not all done for you!
• Managers - see the issues that must be addressed.
• Part I: Overview of the issues - Managers and developers
• Part II: Coding issues - Developers
A practical guide to writing
multithreaded code - Mark Bartosik
1
Part I
• An overview for managers and developers
• Performance issues are marked with a red arrow.
• Most of the other issues relate to reliability.
• Best practice advice is marked with a blue tick.
Part II
• Coding issues, mostly for developers
– Support within COM (limited amount of COM+)
– Support within Visual Basic
– Support within Java
– How to use C++ effectively
• Avoiding reliability problems
• Tackling performance problems
Common myths
• Threads will make the system run faster.
• We don’t need to test on SMP machines.
• The chances of two threads doing X are so small.
• Only C/C++ programmers need concern themselves.
• We can bother about that later.
The right tools
• SMP machines
• Appropriate libraries:
Dinkumware STL - www.dinkumware.com
SmartHeap - www.microquill.com
Hoard - www.cs.utexas.edu/users/emery/hoard/
Threads.h++ - www.roguewave.com
• Configure msvcrt optimally
MSVCRT_HEAP_SELECT - www.wdj.com/archive/1105/feature.html
Why use SMP?
• You may not be targeting SMP machines, but you still
need to test with SMP machines.
thread A:
char * global = new char[1000];
global[0] = '\0';
_beginthread(thread_B);
_beginthread(thread_C);
WaitForMultipleObjects(...);

thread B:
for (int i = 0; i != x; ++i)
{
    Waitxxxx();
    ...
    do_stuff();
    ...
    strcpy(global, src);
    ...
}

thread C:
...
n = strlen(global); // Bang!
...

~25ms is the standard "quantum" size on NT/2000 (it can be changed).
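One way to remove this class of bug is to guard the shared buffer with a lock. The sketch below is only an illustration: it assumes the problem is an unsynchronized strcpy/strlen on the same buffer, and it swaps the slide's _beginthread for C++11 std::thread and std::mutex. The names g_buffer, write_buffer, read_length and run_buffer_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <thread>

// Hypothetical shared state standing in for the slide's `global` buffer.
static std::mutex g_buffer_mutex;        // protects g_buffer
static char       g_buffer[1000] = "";

// Writer (thread B on the slide): replaces the string while holding the lock.
void write_buffer(const char* src)
{
    std::lock_guard<std::mutex> guard(g_buffer_mutex);
    std::strncpy(g_buffer, src, sizeof g_buffer - 1);
    g_buffer[sizeof g_buffer - 1] = '\0';
}

// Reader (thread C on the slide): can never observe a half-written string.
std::size_t read_length()
{
    std::lock_guard<std::mutex> guard(g_buffer_mutex);
    return std::strlen(g_buffer);
}

// Runs one writer and one reader concurrently; returns the final length.
std::size_t run_buffer_demo()
{
    std::thread writer([] {
        for (int i = 0; i != 1000; ++i)
            write_buffer(i % 2 ? "short" : "a much longer string");
    });
    std::thread reader([] {
        for (int i = 0; i != 1000; ++i) {
            const std::size_t n = read_length();
            assert(n == 0 || n == 5 || n == 20);   // only complete strings are seen
        }
    });
    writer.join();
    reader.join();
    return read_length();
}
```

The lock is held only for the copy or the length check, so the two threads still run mostly in parallel.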
Why test on SMP?
• There are categories of bugs that almost never occur
on a uniprocessor machine, but are much more likely
to occur on an SMP machine. Your choice is:
– Test on a uniprocessor machine with a very varied data set,
and on all CPU speeds that you will ever run on (or vary the
quantum size from 1 to 100ms).
– Let your most important customer find the bugs, because he
has the most unusual data set.
– Test on an SMP machine.
• Which costs most now?
You’re exaggerating right?
• Yes and No.
• In 25ms a huge number of instructions can be
performed. So the chance of a thread swap occurring
in the critical place is small.
• This is a demonstration of a bug that is almost
impossible to test for on a uniprocessor machine.
Other more common classes of bug are easier to test
for without SMP, but much easier still to test for on an
SMP machine. With SMP you need fewer test case
iterations to find a bug.
SMP benefits
• Reduced testing and/or more confidence.
• Scalability (for servers).
• More horse power. You can buy a 2 x 1GHz machine
for less than a 1.4GHz one way machine.
• Motivation - put a smile on a developer’s face and
upgrade his machine.
SMP downsides
• Not everybody’s software runs on SMP.
• Hey, this just proves my point, SMP finds bugs for you!
• Badly written code can actually run slower on SMP.
Why use threads?
• Wait on a blocking call without hanging the UI thread.
• Smooth scheduling.
• Apparent performance.
• Division of truly independent and parallel tasks (e.g.
background tasks, or multiple clients).
• Avoid polling.
Polling is almost never right. Poll too often and you
burn CPU power; poll too slowly and you appear
unresponsive. Also you can completely miss events.
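As an illustration of waiting instead of polling, here is a minimal sketch of a one-shot event built on a C++11 condition variable. This is an assumption about how one might do it in modern C++, not something from the talk (which would have used OS events); event_t and run_event_demo are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot event: wait() blocks until signal() has been called.
class event_t
{
public:
    void signal()
    {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_signaled = true;
        }
        m_cond.notify_all();
    }
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        // The predicate guards against spurious wake-ups and missed signals.
        m_cond.wait(lock, [this] { return m_signaled; });
    }
private:
    std::mutex              m_mutex;
    std::condition_variable m_cond;
    bool                    m_signaled = false;
};

// The main thread sleeps in wait() rather than polling `result` in a loop.
int run_event_demo()
{
    event_t ready;
    int result = 0;
    std::thread worker([&] { result = 42; ready.signal(); });
    ready.wait();              // no CPU burned, no polling interval to tune
    const int seen = result;   // safe: written before signal(), read after wait()
    worker.join();
    assert(seen == 42);
    return seen;
}
```

Because the flag is remembered, the event cannot be missed even if signal() runs before wait() is reached.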
Threads have an overhead
• Every thread that you create has an overhead:
– it needs kernel resources to manage it
– it needs a stack (memory, with a typical reserved size of 1MB)
– it needs scheduling
– it needs creating
– it needs to be cleaned up after it quits
– OS handles must be duplicated
– it can add complexity to the program
– it may compete for resources
– you may need to swap between threads
• Only create a thread if there is a real need.
KISS
• Keep it simple stupid!
• If you don’t have a justified requirement
for multiple threads, then you are
causing unnecessary complication.
• This talk is non-trivial and so is
threading.
Advice
• Illustrate the object relationships, clearly
showing the tiers.
• Document the threading model of your
design.
• Illustrate what objects reside in what
threads or apartments.
Terminology
• SMP
– Symmetrical multiprocessing. More than one CPU
in a machine, each with equal access to system
resources.
[Diagram: CPU 0 and CPU 1, each with its own on board cache, linked by cache sync; both share the external cache and main memory.]
Terminology
• Serialize
– To “order sequentially”. Often we need to serialize
requests (function calls). This means that if two
threads call a function foo() then we must
ensure that the thread A has fully executed foo()
before the thread B is permitted to start executing
foo().
Terminology
• Thread safe lock (abbreviated to lock)
– A resource that only one thread can hold at
any one time.
thread_safe_lock_t my_lock; // Protects data below
std::string my_string;
std::vector<int> my_numbers;
void foo()
{
my_lock.lock();
my_numbers.push_back(my_string.size());
my_lock.unlock();
}
How locks work
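Threads A and B both call foo(). The numbers refer to the numbered steps that follow.
void foo()
{
    my_lock.lock();                          // (1) A acquires; (3) B waits here; (5) B acquires
    my_numbers.push_back(my_string.size());  // (2) A executes
    my_lock.unlock();                        // (4) A unlocks; (6) B unlocks
}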
How locks work
1) Thread A enters the function and acquires the lock.
2) Thread A gets on with executing the function.
3) Thread B enters the function but cannot acquire the
lock, so it waits until the lock is available.
4) Thread A unlocks the lock and leaves the function.
5) Thread B is now able to acquire the lock, and
execute the function.
6) Thread B unlocks the lock and leaves the function.
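The six steps can be tried out with a small sketch. It assumes std::mutex stands in for the slide's thread_safe_lock_t (a type from the talk whose definition is not shown) and uses std::thread to play threads A and B; run_foo_demo is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

static std::mutex               my_lock;               // protects the data below
static std::string              my_string = "hello";
static std::vector<std::size_t> my_numbers;

void foo()
{
    my_lock.lock();                            // steps 1 and 5 (step 3: the other thread waits here)
    my_numbers.push_back(my_string.size());    // step 2
    my_lock.unlock();                          // steps 4 and 6
}

// Two threads call foo() 1000 times each; every push_back is serialized.
std::size_t run_foo_demo()
{
    my_numbers.clear();                        // no other threads exist yet
    std::thread a([] { for (int i = 0; i != 1000; ++i) foo(); });
    std::thread b([] { for (int i = 0; i != 1000; ++i) foo(); });
    a.join();
    b.join();
    return my_numbers.size();
}
```

Without the lock, two concurrent push_back calls on the same vector would be a data race and could corrupt it.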
Overview of threading problems
1) Deadlocks
– This is where a thread stops in its tracks waiting
for some event to be signaled, or some resource
to become free. But that will never happen. So the
thread waits forever.
2) Lack of atomic calls
– It is possible for the object to be accessed whilst in
an inconsistent state, because what is logically a
single operation is split across more than one
function call.
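To make "lack of atomic calls" concrete, here is a hedged sketch of a stack where every member is individually thread safe, yet the empty()/pop() pair is not atomic. safe_stack_t and try_pop are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <stack>

// Hypothetical thread safe stack, for illustration only.
class safe_stack_t
{
public:
    void push(int v)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_items.push(v);
    }
    bool empty() const
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_items.empty();
    }
    // Risky when used as "if (!s.empty()) s.pop();" because another
    // thread can pop the last item between the two calls.
    void pop()
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_items.pop();
    }
    // Atomic alternative: the test and the removal happen under one lock.
    bool try_pop(int& out)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (m_items.empty())
            return false;
        out = m_items.top();
        m_items.pop();
        return true;
    }
private:
    mutable std::mutex m_mutex;
    std::stack<int>    m_items;
};
```

The design point is that the interface, not just the implementation, has to be thread safe: the logical operation "pop if not empty" is offered as a single call.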
Overview of threading problems
3) Race conditions
– e.g. A sent event is generated by one thread and
an acknowledgement by another thread. It may be
possible to receive the acknowledgement before
receiving the initial sent event.
4) Data corruption
– e.g. String is copied before it is completely written.
Similar to the SMP example.
Overview of threading problems
5) Indirect freezes
– The UI freezes because it is waiting on a resource
held by a background thread that is performing a
blocking operation.
6) Lost locks
– This is a special case of deadlock, where a lock
has been acquired but the code forgets to release
the lock. The next thread to attempt to acquire
the lock will wait forever.
Overview of threading problems
7) Resource contention
– This is where there is a very frequently used
resource that only one thread can use at a time.
The CPU can spend more time swapping threads
than it does doing the real work.
8) Programmed thread context swapping
– Typically caused by a local/remote procedure call
(lrpc) being made from thread A to thread B. This
is almost as expensive as interprocess
communication.
Overview of threading problems
9) Mixed component capabilities
– Some are thread safe and some are not
10) Mixed environment expectations
– Visual Basic, C++, Java, COM, COM+ and third
party solutions may all have differing threading
expectations and solutions, creating confusion.
11) Everybody has a solution
– If I had $1 for every CThread class I’ve seen :-)
Tools
• The programming environment
– VB/Java/C++ COM COM+
• Debugger
• Perfmon
• QSlice (quick slice)
• Profiler
Questions ?
End of part I
Environment support
• Windows NT/2000
- far too much to cover, and mostly too low level
• COM
• Visual Basic
• Java
• C++
COM support
• The problem:
– Component A was written in VB and is not thread
safe.
– Component B was written in C++ or Java and is
thread safe.
– These components cannot safely interact.
• Either the whole system must be written using one
scheme or we must devise a way to interconnect
them safely.
COM support
How COM solves the problem:
[Diagram: inside the process boundary, the primary STA plus 0-n further STAs. Key: an STA (Single Threaded Apartment) contains exactly 1 thread.]
COM support
How COM solves the problem:
[Diagram: inside the process boundary, the primary STA, 0-n further STAs and 0-1 MTA. Key: an STA (Single Threaded Apartment) contains exactly 1 thread; an MTA (Multi-threaded Apartment) contains 1-n threads.]
Cross apartment calls
[Diagram: the same apartments (primary STA, 0-n STAs, 0-1 MTA); object A in one apartment calls object B in another apartment via lrpc.]
Cross apartment calls are very expensive. A large amount of RPC code
gets executed, and two thread swaps have to occur. One thread swap to
execute the code in object B, and another to get back to A’s thread.
COM - example 1
A dialog in the primary STA creates a line object.
The line object creates two points.
The line object is threading model “both”.
The points are threading model “apartment”.
[Diagram: the primary STA containing the dialog, the line, and the points p1 and p2.]
COM example 2
[Diagram: an STA and an MTA holding the filesaver, the line and the points p1 and p2; p1 and p2 are reached through proxies via lrpc.]
The points cannot be instantiated in the MTA so a proxy
is used. This is very inefficient.
Management implications
• In an ideal world:
– all developers would be expert in all languages.
– each component would be written in the most suitable language for
the domain.
• In the real world
– most developers are proficient in just one or two languages.
– resource and scheduling issues force managers to choose the
language that the component is written in based on the available
skills.
• Result
– major inefficiencies can be built into a product.
– if you are lucky and you find the performance bottleneck you may
need to completely rewrite the component in another language.
More inefficiencies
[Diagram: triangle and line objects split between the MTA and an STA; calls to a line in the other apartment go via lrpc.]
Free threaded marshaler
[Diagram: triangles in the MTA and the STA use the line objects directly; each line aggregates the free threaded marshaler.]
Free threaded marshaler
[Diagram: as before, each line aggregates the free threaded marshaler, but each line now also holds points (p) that live in an STA.]
Free threaded marshaler
[Diagram: as before, but a line that is called directly from another apartment uses its points (p), which belong to an STA: bang!]
One Solution
• NT 4.0 SP3 introduced the GIT (global interface
table). You store a “handle to an object” rather than a
direct pointer. Before accessing an object you can
give the GIT the handle and ask it for a valid pointer
(to a proxy).
• GITPtr - see code handout.
– If anyone wants to use this, I can talk about it separately.
COM support summary
• For all this work COM has only solved a couple of
problems:
* It has directly solved problem (9) mixed
component capabilities.
* Because it is a standard on Microsoft platforms it
indirectly solves problem (10) mixed environment
expectations.
• However, it solves these problems at the cost of
problem (8) thread context swapping. The thread
neutral model avoids that cost.
Enter COM+
• Windows 2000 only.
• Much of COM+ is about transaction processing.
• In addition to a “threading model” components can
now have a “synchronization” attribute.
• There is a finer grain of isolation of incompatible
code, each apartment can have one or more
contexts.
• Thread neutral apartment.
• I haven’t used it.
Thread neutral apartment
• COM+ only.
• Now Microsoft’s recommended
threading model for non-UI code.
Seriously consider it when targeting
Windows 2000.
• Like FTM but without the problems. No
need for GITPtr.
COM+ synchronization
• Disabled
– Just like COM
• Not Supported
– Objects make no promises, it is not synchronized. (These ought to
be apartment threaded.)
• Supported
– Objects provide their own (custom) synchronization.
• Required
– Requires COM+ to provide the synchronization. Even when it’s not
needed.
• Requires New
– ?
Questions?
• Next we’ll see how Visual Basic fits into
the COM framework.
Theme
• A recurring theme in writing robust software is
to restrict things as much as you can, please
watch out for it.
– Prefer private over public
– Prefer variables in this order:
• automatics, parameters, members, globals
– Prefer const over non-const
– The more you expose the more there is to go
wrong.
VB variables
Sub foo()
dim x as Integer
x = Something
End Sub
• This variable does exactly what you expect. It has the life time of
the function call. It is only visible within the function.
• See handout “vb_rules”.
VB variables
Private m_credits_earned as Integer
Sub Grade()
If m_credits_earned > 50 Then
call TopGrade
End If
End Sub
• The code excerpt comes from a VB ‘student’ class.
• This variable does exactly what you expect. The life time of
m_credits_earned is the same as the lifetime of the student.
Each separate student has a different instance of
m_credits_earned.
VB variables
Sub Bar()
static is_recursive as boolean
If is_recursive Then
Exit Sub
End If
is_recursive = true
call DoStuff ' beware: DoStuff can call Bar
is_recursive = false
End Sub
• This use of static does exactly what you expect and want.
Although it is rather different to the C++ meaning with the same
syntax. It declares a member variable only visible to the
function. Member variables have the lifetime of the form or
class, or module they are part of.
VB variables
Private g_total_something as Integer
Sub Foo()
If g_total_something < 10 Then
call DoStuff
g_total_something = g_total_something + 1
End If
End Sub
• The code excerpt comes from a VB module. The behavior is
different for a form or class, and anyway you would use
m_total_something in a form or class.
• This variable does exactly what you expect and want until the
code is run in a different apartment. The code in the new
apartment uses a new instance of g_total_something and
initializes it to zero.
Visual Basic Support
• All Visual Basic objects use the apartment model. All
objects exist in an STA.
• To avoid the problem of concurrent access, global
variables are really “per apartment” variables. Since
most Visual Basic objects exist in the primary STA, in
effect globals are usually per process. However, you
should not rely upon this.
Visual Basic Support
• To serve as a simple demonstration we have two simple
Visual Basic programs. Code listings are provided.
• How the server (counter) is compiled with regard to
threading dramatically affects how the client code works.
• The moral is, beware of module level variables in VB,
otherwise called globals. They are thread safe, but make
the code very hard to test in a multithreaded environment,
and are open to misuse.
Visual Basic globals
Typically both objects will be in the same apartment,
the primary STA.
[Diagram: one STA containing two Class1 objects that share a single g_number.]
In this case the client is accessing two objects
that are both in the same apartment.
Visual Basic globals
• However depending on the build options and where
the objects are obtained from the following situation
can arise:
[Diagram: two STAs, each containing one Class1 object and its own g_number.]
In this case the client is accessing two objects in
different apartments.
Visual Basic and reentrance
Private m_divisor as Integer
Private Sub Form_Load
m_divisor = 1
End Sub
Public Sub foo()
m_divisor = 0
End Sub
Private Sub Bar()
If m_divisor <> 0 Then
RaiseEvent some_event ' or otherwise yield
MsgBox 100 / m_divisor
End If
End Sub
Visual Basic and reentrance
• When the event is raised there is nothing stopping a
client that handles that event from calling foo().
• Even if none of the clients call foo(), we cannot be
sure that one of the clients is not in another
apartment. If there is a client in another apartment it
will be called through lrpc, and will yield. Yielding
permits COM to unblock a previously blocked request
either foo() or bar().
• The ideal solution is to queue the event and raise it
later, but this is a lot of trouble in VB.
Reentrance solutions
Private Sub Bar()
If m_divisor <> 0 Then
dim divisor as Integer
divisor = m_divisor ' copy the member variable
RaiseEvent some_event ' or otherwise yield
MsgBox 100 / divisor
End If
End Sub
or
Private Sub Bar()
If m_divisor <> 0 Then
MsgBox 100 / m_divisor
RaiseEvent some_event ' or otherwise yield
End If
End Sub
Accessing multithreaded objects
Sub Bar()
If x.number <> 0 Then
call ExpensiveOperation()
MsgBox 100 / x.number
End If
End Sub
Just because only one thread at a time can access
your object do not assume that other objects cannot
be accessed by more than one thread!
VB - what have we learnt ?
• Be sure that you understand the meaning of a global
variable. VB does not have real globals.
• Global constants are okay.
• Per object variables (member variables) have a more
clear meaning, but beware of reentrance.
• When possible raise events at the end of functions.
• Avoid calling across apartments, if you must, prefer
to do it at the end of functions.
• Never yield (never call DoEvents)
• The state of other objects can change underneath
you.
Questions ?
• How many people want to cover Java
support for threads ?
Or shall we move on to C++ ?
My knowledge of Java is not deep
Java support
• Java has more powerful and fine grain support for
threads than does VB, it is much more like C++.
• Threads can be created, waited upon, suspended,
synchronized.
• The keyword synchronized can be applied to blocks
of code, member functions, or static class functions.
• All variables can be waited on or signaled, or locked.
• Some standard Java classes are thread safe and
others are not. Read the documentation!
Java support
• Per object synchronization
– Typically this is used to protect member variables of an
object being simultaneously accessed by two threads.
public class MyClass
{
public synchronized void foo()
{
}
}
Conceptually there is an extra member variable in each object
which is locked when any synchronized method is entered
and unlocked when it is exited.
Java support
• Per class synchronization
– Typically this is used to protect static members of a class
from being simultaneously accessed by two threads.
public class MyClass
{
public synchronized static void bar()
{
}
}
In this case the Java virtual machine grabs a lock on the “class
object”. The “class object”, is a special Java object that
describes the class. There is one class object for each class.
Java support
• Block synchronization
– Typically this is used to provide finer grained protection to
variables.
public void foo()
{
synchronized (this)
{
}
}
As written this is equivalent to synchronizing the whole function; however,
we could just synchronize the else part of an if statement, or any
block of code.
C++ support
• Standard C++ has no built in support for
threads.
• However compiler vendors usually provide some
support for threads.
• Since C++ is extensible (via classes) it potentially has
the most powerful support.
• Many of the issues here also apply to Java.
C++ process memory
See code handout with same title
[Diagram: C++ process memory. Thread local storage (TLS) holds one copy of a variable per thread (t). Static storage has a fixed size and is accessible by all threads (g). Stacks, one per thread, grow on demand and hold the automatics (a, b, c, int_ptr, int_array). The heap grows on demand, is accessible by all threads, and holds everything created with 'new', including all COM objects (the example shows an array of 1024 elements, 4096 bytes).]
C++ process memory
• C++ has
– automatic data (allocated from stack)
– static data
(static data can be scoped per process, file, class,
struct or function).
– heap / free store data
– per machine data (vendor specific)
– per thread data (vendor specific)
– potentially we could extend C++ via a class (most
likely a smart pointer) to support per apartment
data
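The slide describes per thread data as vendor specific. As a hedged illustration only, the sketch below uses the thread_local keyword that C++11 later standardized (an assumption, since the talk predates it); t_counter and run_tls_demo are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// One instance of t_counter exists per thread, so no lock is needed.
static thread_local int t_counter = 0;

int run_tls_demo()
{
    t_counter = 10;                    // the calling thread's instance
    int seen_by_worker = -1;
    std::thread worker([&seen_by_worker] {
        seen_by_worker = t_counter;    // the worker's own instance, still 0
        t_counter = 99;                // does not affect the calling thread
    });
    worker.join();
    assert(seen_by_worker == 0);
    return t_counter;                  // unchanged by the worker
}
```

Contrast this with static data, where both threads would have seen and modified the same variable.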
C++ memory
• Clearly C++ has the most flexible scheme.
• However since there is no built in support for threads
we must be careful to protect the data that can be
accessed by multiple threads. This means protecting
all static and heap based data.
• Actually the lack of built in support, as we will see
later, can be an advantage.
The correct way to use locks
with non-member data
static critical_section_t my_critsec; // protects data below
static my_data_t my_data1;
static my_data_t my_data2;
void foo()
{
cs_lock_t auto_lock(my_critsec);
my_data1.do_something();
my_data2.do_something();
}
• Here we are protecting some per file static data my_data1
and my_data2. The cost is very small, and the usage is very
similar to Java’s built in support.
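The talk does not show how critical_section_t and cs_lock_t are defined. A minimal sketch, assuming they are a thin wrapper and a scoped (RAII) guard, might look like the following; here they are built on std::mutex rather than a Win32 critical section, and my_value, may_throw and run_raii_demo are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-ins for the slide's types, built on std::mutex.
class critical_section_t
{
public:
    void enter() { m_mutex.lock(); }
    void leave() { m_mutex.unlock(); }
private:
    std::mutex m_mutex;
};

class cs_lock_t
{
public:
    explicit cs_lock_t(critical_section_t& cs) : m_cs(cs) { m_cs.enter(); }
    ~cs_lock_t() { m_cs.leave(); }               // always runs, even on throw
    cs_lock_t(const cs_lock_t&) = delete;
    cs_lock_t& operator=(const cs_lock_t&) = delete;
private:
    critical_section_t& m_cs;
};

static critical_section_t my_critsec;            // protects my_value
static int                my_value = 0;

void may_throw(bool fail)
{
    cs_lock_t auto_lock(my_critsec);
    ++my_value;
    if (fail)
        throw std::runtime_error("failed while holding the lock");
}   // auto_lock releases the lock on the normal and the exceptional path

// If the exception had left the lock held ("lost lock"), the second call
// would block forever instead of returning.
int run_raii_demo()
{
    try { may_throw(true); } catch (const std::runtime_error&) {}
    may_throw(false);
    return my_value;
}
```

The scoped guard is what makes "lost locks" hard to write by accident: there is no unlock call to forget.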
The correct way to use locks
with member data
class my_class_t
{
.
.
mutable critical_section_t m_critsec; // protects data below
my_data_t m_data;
};
// to be able to use const you must declare the critical section member as mutable
void my_class_t::foo() const
{
cs_lock_t auto_lock(m_critsec);
m_data.do_something();
}
Deadlocks - the problem
crit_sect_t cs_a;
object_t obj_a;
int a_val;
crit_sect_t cs_b;
object_t obj_b;
int b_val;

void foo()
{
    cs_lock_t lock1(cs_a);
    obj_a.process(a_val);
    do_something();
    cs_lock_t lock2(cs_b);   // deadlock: bar() may already hold cs_b
    obj_b.process(b_val);
}

void bar()
{
    cs_lock_t lock1(cs_b);
    obj_b.process(b_val);
    do_something_else();
    cs_lock_t lock2(cs_a);   // deadlock: foo() may already hold cs_a
    obj_a.process(a_val);
}
Deadlocks a solution
void foo()
{
    {
        cs_lock_t lock1(cs_a);
        obj_a.process(a_val);
    }
    do_something();
    {
        cs_lock_t lock2(cs_b);
        obj_b.process(b_val);
    }
}

void bar()
{
    {
        cs_lock_t lock1(cs_b);
        obj_b.process(b_val);
    }
    do_something_else();
    {
        cs_lock_t lock2(cs_a);
        obj_a.process(a_val);
    }
}
Deadlocks - another solution
void foo()
{
    cs_lock_t lock1(cs_a);
    cs_lock_t lock2(cs_b);
    obj_a.process(a_val);
    do_something();
    obj_b.process(b_val);
}

void bar()
{
    cs_lock_t lock1(cs_a);
    cs_lock_t lock2(cs_b);
    obj_b.process(b_val);
    do_something_else();
    obj_a.process(a_val);
}
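Acquiring the locks in one agreed order relies on every author remembering the order. As a hedged alternative (not from the talk), C++11's std::lock acquires several mutexes with a deadlock avoidance algorithm, so the order written at the call site does not matter. The sketch assumes std::mutex in place of crit_sect_t; run_lock_order_demo is hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

static std::mutex cs_a;     // protects a_val
static std::mutex cs_b;     // protects b_val
static int        a_val = 0;
static int        b_val = 0;

void foo()
{
    std::lock(cs_a, cs_b);                                     // takes both, deadlock free
    std::lock_guard<std::mutex> lock1(cs_a, std::adopt_lock);  // adopt for automatic release
    std::lock_guard<std::mutex> lock2(cs_b, std::adopt_lock);
    ++a_val;
    ++b_val;
}

void bar()
{
    std::lock(cs_b, cs_a);                                     // opposite order is still safe
    std::lock_guard<std::mutex> lock1(cs_b, std::adopt_lock);
    std::lock_guard<std::mutex> lock2(cs_a, std::adopt_lock);
    ++b_val;
    ++a_val;
}

// foo() and bar() hammer both locks concurrently without deadlocking.
int run_lock_order_demo()
{
    std::thread t1([] { for (int i = 0; i != 1000; ++i) foo(); });
    std::thread t2([] { for (int i = 0; i != 1000; ++i) bar(); });
    t1.join();
    t2.join();
    return a_val + b_val;
}
```

As in the slide, the price is that both locks are held for the whole function, which reduces concurrency.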
Deadlocks - subtle risks
void my_class_t::foo()
{
cs_lock_t auto_lock(m_crit_sec);
m_client->update();
m_val = m_val + 1;
}
• The problem is that we have no knowledge of and no
control over what the client does in the call to update(). If
the client attempts to lock a resource we could hit a
variation on our old deadlock situation.
Deadlocks - more subtle risks
void my_class_t::foo()
{
cs_lock_t auto_lock(m_crit_sec);
FireUpdateEvent();
m_val = m_val + 1;
}
• The problem is exacerbated by Microsoft. Microsoft
code almost always fires events synchronously.
• To be fair C# has support for asynchronous events.
A possible work around
void my_class_t::foo()
{
{
cs_lock_t auto_lock(m_crit_sec);
m_val = m_val + 1;
}
FireUpdateEvent();
}
• This avoids the risk of deadlock but is not always
practical. Also we do not know whether the caller of
foo() has locked a resource. The only reliable
solution is to use asynchronous events.
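The work around can be sketched as follows. This is an illustration under assumptions: the event is modeled as a std::function callback, and counter_t, get and run_event_after_unlock_demo are hypothetical names. The handler calls back into the object, which would deadlock if the event were fired while the lock was still held.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

class counter_t
{
public:
    explicit counter_t(std::function<void(int)> on_update)
        : m_on_update(std::move(on_update)) {}

    void foo()
    {
        int snapshot;
        {
            std::lock_guard<std::mutex> auto_lock(m_mutex);
            m_val = m_val + 1;
            snapshot = m_val;      // copy what the handler needs
        }                          // lock released here
        m_on_update(snapshot);     // handler may safely call get()
    }

    int get() const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return m_val;
    }

private:
    mutable std::mutex       m_mutex;   // protects m_val
    int                      m_val = 0;
    std::function<void(int)> m_on_update;
};

// The handler re-enters the object; this only works because foo()
// fires the event after dropping its lock.
int run_event_after_unlock_demo()
{
    int seen_in_handler = -1;
    counter_t* self = nullptr;
    counter_t counter([&](int) { seen_in_handler = self->get(); });
    self = &counter;
    counter.foo();
    return seen_in_handler;
}
```

Note the caveat from the slide still applies: this does nothing about locks held by whoever called foo().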
Support for asynchronous events
• A good implementation is non-trivial.
• None of ATL, MFC, Visual Basic, or Java supports
asynchronous events (C# does).
• Do not implement your own per case solutions. Use
library code.
• I know of only one good source for reusable source
code to implement asynchronous events in C++:
www.cuj.com/archive/1610/1610list.html
If you would like to use this, please see me because
I’ve got some improvements and fixes for it, also
there are issues with scripting clients.
Deadlocks - very subtle risks
static critical_section_t file_scoped_critsec;
void my_class_t::foo()
{
cs_lock_t auto_lock(file_scoped_critsec);
p_low_level_obj->foo();
m_val = m_val + 1;
}
(Downward calls are okay - have a diagram.)
• Avoid having global locks if you can. Minimize the
visibility of lockable resources.
• If you must have one, beware of what you do whilst
holding it. Do not do anything that could result in a
cross-apartment call.
Global lock example
[Diagram: Object A in the MTA calls into Object B in the STA.]
Object A locks the global (non-member) resource, and
calls into object B. Object B also requires the global
resource, and waits.
Causality
causality n : the relation between causes and effects
• The operating system applies locks based on
the thread id, not the causality. As long as you
stay within its framework COM is able to
track the causality.
• In the previous slide there were two threads
but only one causality.
Causality
[Diagram: Object A in the MTA calls Object B in the STA, and Object B calls back into Object A.]
Object A is blocked in an RPC call to object B. Object B calls back
object A, but the system (COM+) knows that this is the same
causality and allows the RPC to unblock to allow the call to
continue (otherwise it would deadlock).
Breaking causality
• If object B were to create another thread and that
thread were to attempt to call back object A then we
would have fooled COM’s causality checking, and
deadlock would ensue.
Read/Write locks
• So far we have locked our resource regardless of
whether the client is just reading or modifying the
data. Only one client at a time can have any access.
• This is inefficient, because we could have 1000
clients all wanting to read the data, and they will not
interfere with each other, so they could all have
access in parallel.
• What we need is a different type of lock. The lock will
only deny access to readers if a writer holds the lock.
The lock will deny access to writers if anyone holds
the lock.
Usage
int my_class_t::get_value() const
{
reader_lock auto_lock(m_critsec);
return m_value;
}
void my_class_t::modify_value(int new_value)
{
writer_lock auto_lock(m_critsec);
m_value = new_value;
}
• The get_value function locks the object for reading, and the
modify_value function locks it for writing. However, it is up to the
developer to correctly choose whether to perform a read or write
lock. It is possible to have the code automatically select a read
or write lock - advanced topic.
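The reader_lock and writer_lock types are not defined in the slides. One way to get the same behavior today, offered as an assumption rather than the author's implementation, is C++17's std::shared_mutex with std::shared_lock for readers and std::unique_lock for writers. run_rw_demo is a hypothetical helper.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class my_class_t
{
public:
    int get_value() const
    {
        std::shared_lock<std::shared_mutex> auto_lock(m_rwlock);  // many readers at once
        return m_value;
    }
    void modify_value(int new_value)
    {
        std::unique_lock<std::shared_mutex> auto_lock(m_rwlock);  // exclusive access
        m_value = new_value;
    }
private:
    mutable std::shared_mutex m_rwlock;   // protects m_value
    int                       m_value = 0;
};

// Four readers run in parallel while one writer updates the value.
int run_rw_demo()
{
    my_class_t obj;
    std::vector<std::thread> readers;
    for (int i = 0; i != 4; ++i)
        readers.emplace_back([&obj] {
            for (int j = 0; j != 1000; ++j) {
                const int v = obj.get_value();
                assert(v == 0 || v == 7);   // only the old or the new value
            }
        });
    obj.modify_value(7);
    for (auto& t : readers)
        t.join();
    return obj.get_value();
}
```

As the slide says, choosing the right kind of lock in each function is still the developer's job.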
Temporary freezes
void my_class_t::foo()
{
cs_lock_t auto_lock(m_crit_sec);
potentially_expensive_off_machine_function_call();
}
int my_class_t::get_trivial_value()
{
cs_lock_t auto_lock(m_crit_sec);
return m_trivial;
}
• Never hold a lock for an extended period of time.
The function foo is called from a background thread. So m_crit_sec
remains locked for several seconds. The UI thread attempts to call
get_trivial_value and it is also blocked.
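A common way to avoid the freeze is to hold the lock only while copying data in and out, and to make the slow call with no lock held. The sketch below is a hedged illustration; cache_t, slow_lookup and run_refresh_demo are hypothetical names, and the 50ms sleep stands in for the expensive off machine call.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical slow call, standing in for an off machine request.
std::string slow_lookup(const std::string& key)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return "value-for-" + key;
}

class cache_t
{
public:
    explicit cache_t(const std::string& key) : m_key(key) {}

    void refresh()   // called from a background thread
    {
        std::string key;
        {
            std::lock_guard<std::mutex> auto_lock(m_mutex);
            key = m_key;                           // brief lock: copy the input
        }
        const std::string result = slow_lookup(key);   // no lock held here
        {
            std::lock_guard<std::mutex> auto_lock(m_mutex);
            m_result = result;                     // brief lock: publish the result
        }
    }

    int get_trivial_value() const   // called from the UI thread
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return m_trivial;
    }

    std::string get_result() const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return m_result;
    }

private:
    mutable std::mutex m_mutex;     // protects the members below
    std::string        m_key;
    std::string        m_result;
    int                m_trivial = 5;
};

std::string run_refresh_demo()
{
    cache_t cache("k");
    std::thread background([&cache] { cache.refresh(); });
    const int trivial = cache.get_trivial_value();   // never waits on the slow call
    assert(trivial == 5);
    background.join();
    return cache.get_result();
}
```

The trade-off is that the object may change between the two brief locks, so the published result must still make sense if m_key was updated meanwhile.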
When not to use a lock
• Locks add complexity, cost CPU cycles to acquire and release,
and risk deadlock, so it’s best to avoid them when you can.
• When the data is only accessed by one thread:
– Automatics (stack based)
– Thread local storage
• Data that does not change:
– const
• Stateless objects
• Naturally self consistent data without race conditions:
– e.g. a single bool set by the user, and read by a background
thread.
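A note of caution on the single bool case: since C++11 the language treats an unsynchronized read and write of a plain bool from two threads as a data race. A hedged way to keep the "no lock" intent is std::atomic<bool>, as in this sketch; g_stop_requested and run_until_stopped are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Set by the user (for example a Cancel button), read by a background thread.
static std::atomic<bool> g_stop_requested(false);

// The worker checks the flag between units of work; no lock is involved.
long run_until_stopped()
{
    long iterations = 0;
    std::thread worker([&iterations] {
        while (!g_stop_requested.load())
            ++iterations;              // stands in for one unit of real work
    });
    g_stop_requested.store(true);      // the "user" asks the worker to stop
    worker.join();
    assert(g_stop_requested.load());
    return iterations;                 // how far the worker got; varies per run
}
```

This works because the flag is a single self consistent value; as soon as two pieces of data must change together, a lock is needed again.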
Good use of const
const double pi = 3.14159265359;                 // point 0
class person_t
{
public:
    person_t(const std::string & fore_name,      // point 1
             const std::string & family_name) :
        m_fore_name(fore_name),
        m_family_name(family_name)
    {}
    std::string get_fullname() const             // point 2
    {
        return m_fore_name + m_family_name;
    }
private:
    mutable crit_sec_t m_critsec;                // point 3
    const std::string m_fore_name;               // point 4
    const std::string m_family_name;
};
Good use of const
0> const type identifier = value;
Most people’s simple view of “const”: an item of data (often global)
with a fixed value.
1> foo(const T & arg)
Pass by “const reference”. This means that the function promises not to
modify the arguments that have been passed by reference.
2> class_t::func() const
A “const member function”. This means that the function promises not
to modify the “logical state” of the object. It may change the bit pattern
of “mutable” members but it will preserve the state of the object.
Good use of const
3> mutable type identifier;
A “mutable data member”. This declares a member variable that does
not contribute to the logical state of the object. It can be modified by
“const member functions”.
4> const type identifier;
A “const data member”. This declares a data member that never
changes. If it is to have a value it must be given that value immediately
when it is created.
Good use of const
• The code for person_t is almost completely thread
safe. Once constructed the data clearly never
changes. So no locks are required.
• “almost”
Well there is the issue of lifetime. Sometimes this is
simple, sometimes not.
• In a Microsoft environment we can use COM to do
reference counting. We can inherit a very lightweight
implementation of IUnknown. Or we can use
boost::shared_ptr (but make it thread safe).
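As an illustration of the lifetime point, here is a sketch that uses std::shared_ptr, the C++11 standard descendant of boost::shared_ptr, whose reference count is updated atomically. The person_ptr alias, make_person helper and the space inserted between the names are assumptions for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

class person_t
{
public:
    person_t(const std::string & fore_name, const std::string & family_name) :
        m_fore_name(fore_name), m_family_name(family_name) {}
    std::string get_fullname() const
    {
        return m_fore_name + " " + m_family_name;
    }
private:
    const std::string m_fore_name;      // never changes: no lock needed
    const std::string m_family_name;
};

// Hypothetical alias: threads share ownership of an immutable person.
using person_ptr = std::shared_ptr<const person_t>;

inline person_ptr make_person(const std::string & fore, const std::string & family)
{
    return std::make_shared<const person_t>(fore, family);
}

// Usage: the object lives until the last owner lets go.
inline void lifetime_example()
{
    person_ptr first = make_person("Ada", "Lovelace");
    person_ptr second = first;          // e.g. handed to another thread
    first.reset();                      // first owner has finished
    assert(second->get_fullname() == "Ada Lovelace");
}
```

Each thread should hold its own person_ptr copy; the count is thread safe, but one pointer variable shared between threads is not.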
Guidance for locks and const
• In general, acquire a lock only in externally visible member
functions. That means public or protected or any virtual
functions, plus functions called internally by asynchronous
means. The remaining internal functions will not need a lock; it
will already have been acquired.
• Acquiring a lock within internal functions is harmless provided
the lock is re-entrant (as a Win32 critical section is), but there
is a performance penalty.
• If you wish to distinguish read and write locks:
– Non-const functions require a write lock.
– const functions require at least a read lock.
Guidance for locks and const
• Non-private member functions that only access const members
need not acquire a lock. Such functions may not call other
member functions.
public:
    int get_foo() const // no lock required
    {
        return m_foo;
    }
private:
    const int m_foo;
    mutable crit_sec_t m_critsec;
Guidance for locks and const
• Non-private or virtual const member functions need only acquire
a read lock.
public: // or protected
    void foo1() const
    {
        read_lock_t auto_lock(m_critsec);
        // ...
    }
    virtual void foo2() const
    {
        read_lock_t auto_lock(m_critsec);
        // ...
    }
Guidance for locks and const
• Other non-private or virtual member functions must acquire a
write lock.
public: // or protected
    void foo3()
    {
        write_lock_t auto_lock(m_critsec);
        // ...
    }
    virtual void foo4()
    {
        write_lock_t auto_lock(m_critsec);
        // ...
    }
Guidance for locks and const
• Private non virtual functions need not acquire locks.
private:
    void foo5()
    {
        // ...
    }
• “Non-private” == public or protected
• Restrict the visibility of functions as much as possible.
• This assumes that mutable data is thread safe, and that threads
are not internally created on private member functions.
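The guidance on the last few slides can be pulled together in one small class. This is a sketch only: account_t and its members are hypothetical, and std::shared_mutex (C++17) stands in for the read/write lock classes.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

class account_t
{
public:
    explicit account_t(int id) : m_id(id) {}

    int get_id() const                  // touches only a const member: no lock
    {
        return m_id;
    }
    int get_balance() const             // const function: read lock
    {
        std::shared_lock<std::shared_mutex> auto_lock(m_mutex);
        return m_balance;
    }
    void deposit(int amount)            // non-const function: write lock
    {
        assert(amount > 0);
        std::unique_lock<std::shared_mutex> auto_lock(m_mutex);
        add(amount);
    }
private:
    void add(int amount)                // private, non-virtual: caller holds the lock
    {
        m_balance += amount;
    }
    const int m_id;
    mutable std::shared_mutex m_mutex;
    int m_balance = 0;
};
```

Note that std::shared_mutex is not re-entrant, so with this lock type the private helper must not try to lock again.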
Granularity
void my_class_t::append(const std::string & new_data)
{
    if (new_data.empty())
    {
        return;
    }
    else
    {
        cs_lock_t auto_lock(m_crit_sec);
        m_existing_data = m_existing_data + ',' + new_data;
    }
}
Granularity
void my_class_t::append(const std::string & new_data)
{
    if (new_data.empty())
        return;
    cs_lock_t auto_lock(m_crit_sec);
    m_existing_data = m_existing_data + "," + new_data;
}
• Either put the lock at the beginning of the function, or soon after
the beginning. If it is hidden in the depths of the function it is too
easy for a maintenance programmer to break the code. On a
uniprocessor machine, delaying the lock is only rarely a performance
optimization, unless of course there are expensive off-machine
calls involved.
Thread local storage
DWORD TlsAlloc();
BOOL TlsSetValue(DWORD, LPVOID);
LPVOID TlsGetValue(DWORD);
BOOL TlsFree(DWORD);
__declspec(thread) int my_per_thread_var = 0;
• Be warned, __declspec(thread) is unsafe to use in a DLL loaded
using LoadLibrary, i.e. any COM DLLs.
• Can be used to detect recursion in a multithreaded environment.
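The recursion-detection idea can be sketched with the C++11 thread_local keyword, used here as a portable stand-in for __declspec(thread) and the Tls* calls. The handle_event function and its flag are hypothetical.

```cpp
#include <cassert>

// One flag per thread: true while this thread is inside handle_event.
thread_local bool t_inside_handler = false;

// Returns true if a re-entrant call was detected on the current thread.
// nested_calls simulates a callback that calls back into the handler.
bool handle_event(int nested_calls)
{
    assert(nested_calls >= 0);
    if (t_inside_handler)
        return true;                    // recursion detected: refuse to re-enter
    t_inside_handler = true;
    bool recursed = false;
    if (nested_calls > 0)
        recursed = handle_event(nested_calls - 1);
    t_inside_handler = false;           // a real version would reset this in a destructor
    return recursed;
}
```

Because the flag is per thread, two different threads can be inside the handler at once without being mistaken for recursion.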
Per machine storage
-------- Shared.h --------
namespace shared
{
    extern bool __declspec(dllexport) per_machine_val;
}
------- Shared.cpp -------
#pragma data_seg(".ANAME")
#pragma comment(linker, "/SECTION:.ANAME,RWS")
namespace shared
{
    bool per_machine_val = false;
}
#pragma data_seg()
Per machine storage
• You cannot store pointers in this storage.
• If you have consistency issues, you must use a
Mutex rather than a critical section for serialization.
• The DLL containing shared::per_machine_val must
be loaded into each process sharing the data.
• If you have multiple copies of the DLL on the machine
there is potential for multiple out-of-sync instances of
the global value.
• The value is initialized when the DLL is first loaded.
Atomic calls
• Consider a rectangle with the following properties:
– set/get: center, width, height
– set/get: top_left, top_right, bottom_left, bottom_right
– set(center, width, height) , get(center, width, height)
• The first interface gives an inherently self-consistent view of an
object. However, it is still possible for the details of a valid but
non-existent rectangle to be read by another thread.
• The second interface is just plain dangerous. It is possible at
any given instant for the object to be any quadrangle, not a
rectangle at all (an invalid state).
• The third interface is safe but unfriendly.
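A sketch of the third interface, assuming std::mutex in place of a critical section. The point_t and rectangle_t types are hypothetical; the point is that all three properties change, and are read, under a single lock acquisition.

```cpp
#include <cassert>
#include <mutex>

struct point_t { double x; double y; };

class rectangle_t
{
public:
    // Set everything in one call: no other thread sees a half-updated rectangle.
    void set(point_t center, double width, double height)
    {
        assert(width >= 0.0 && height >= 0.0);
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        m_center = center;
        m_width = width;
        m_height = height;
    }
    // Read everything in one call: the three values belong to the same rectangle.
    void get(point_t & center, double & width, double & height) const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        center = m_center;
        width = m_width;
        height = m_height;
    }
private:
    mutable std::mutex m_mutex;
    point_t m_center{0.0, 0.0};
    double m_width = 0.0;
    double m_height = 0.0;
};
```

The unfriendliness shows in the caller, who must pass and receive all three values even when only one is of interest.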
Atomic calls
• The main alternatives are:
– To provide a lock and unlock method
– Provide a copy/clone method
– Provide a snapshot method
Atomic calls
• lock/unlock methods can be both risky and inefficient. Risky
because the client could lock but forget to unlock an object. The
alternative is to provide a helper class that does the lock in its
constructor and unlock in its destructor, and only expose the
lock/unlock methods to the helper class. It can also be inefficient
with highly used global objects because it can lead to
contention.
• copy or clone methods may be inappropriate especially if the
object has an identity e.g. an employee object.
• snapshot methods add the complexity of a new interface with
just read-only properties. Copying the internal state of the
original object may be expensive.
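A sketch of the snapshot alternative for an object with identity. The employee_t and employee_snapshot_t types are hypothetical; the snapshot is a read-only copy of the state taken under one lock, so its fields are consistent with each other.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Read-only view of an employee at one instant.
struct employee_snapshot_t
{
    const std::string name;
    const int grade;
};

class employee_t
{
public:
    employee_t(const std::string & name, int grade) : m_name(name), m_grade(grade) {}
    employee_t(const employee_t &) = delete;                // has identity: no copies
    employee_t & operator=(const employee_t &) = delete;

    void set_grade(int grade)
    {
        assert(grade >= 0);
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        m_grade = grade;
    }
    employee_snapshot_t snapshot() const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return employee_snapshot_t{m_name, m_grade};        // copy made under the lock
    }
private:
    mutable std::mutex m_mutex;
    std::string m_name;
    int m_grade;
};
```

The caller can read the snapshot at leisure without holding any lock, at the price of the copy and of the data possibly being out of date.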
Summary of rules for locks
• When possible hold at most one lock at a time.
• Acquire and release locks in “nested” order.
• If you must acquire multiple locks always do it in the same order.
• Read/write locks can be more efficient.
• Hold locks for the minimum time period (no off machine calls).
• Do not call external (non-downward) code whilst holding a lock.
• Use asynchronous events and callbacks.
• Use smart or very smart lock classes.
• Never expose a public lock/unlock method - it invites abuse.
• Each object should protect its own data.
• Make maximum use of const.
• Make use of mutable.
• Only use a lock when there is a need.
• Follow the “Guidance for locks and const”
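The "same order" rule can be sketched as follows, assuming the two locks belong to two objects of one type so that their addresses give a fixed global order. The account_t type and transfer function are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

struct account_t
{
    std::mutex mutex;
    int balance = 0;
};

// Moves amount between two accounts while holding both locks.
void transfer(account_t & from, account_t & to, int amount)
{
    assert(amount >= 0);
    if (&from == &to)
        return;                                     // nothing to do, and avoids self-deadlock
    account_t * first = &from;
    account_t * second = &to;
    if (std::less<account_t *>()(second, first))
        std::swap(first, second);                   // lower address is always locked first
    std::lock_guard<std::mutex> lock_first(first->mutex);
    std::lock_guard<std::mutex> lock_second(second->mutex);   // released in reverse (nested) order
    from.balance -= amount;
    to.balance += amount;
}
```

Two threads calling transfer(a, b, ...) and transfer(b, a, ...) at once both take the locks in the same order, so neither can hold one lock while waiting for the other.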
Library support - locks
• In large scale multithreaded development we
want the following facilities in locks:
– Exception safety
– Correct copy/assignment semantics
– Automatic read/write locking
– Fixed efficient DLL interface on lockable object
– Warning when lock held too long (freeze)
– Minimum overhead in a single threaded environment
– Adapts to operating system (selects fastest method available)
– Optional deadlock detection
Library support - events
• Facilities we want for events and
callbacks
– The ability to fire events and callbacks
asynchronously
– Support for multiple arguments
– Type safety
– Scripting support
Transactions
• Transactions may occur in parallel threads, and typically lock
database records. Database locks, like any other lock, should not
be held for any length of time, and database locks are relatively
expensive.
• Keeping transactions short is essential in a multiuser
transactional system. Transactions must be isolated from each
other. This means that objects currently working on Transaction
A cannot see any of the data being used by other objects
working on Transaction B. Neither transaction knows whether
the other is going to commit or abort, so they don't know
whether the data in the other ongoing transaction is good or not.
You cannot be allowed to see data that is potentially bogus, for
fear you would misuse it, so you get “locked out”.
Transactions
• Keep the granularity fine, lock the minimum of data.
• When possible use a “contra-transaction”. Immediately commit
data without waiting for a user response / confirmation, and
achieve the effect of a roll back by using a “contra-transaction”.
– For example, do not lock the credit for a bank whilst waiting for a
trade to be confirmed. Instead draw down the credit immediately,
and if the deal is not confirmed, generate a contra transaction to
restore the credit.
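The bank credit example can be sketched in memory as below. This is an illustration only: credit_line_t is hypothetical, and a real system would commit a short database transaction where this code takes a brief lock.

```cpp
#include <cassert>
#include <mutex>

class credit_line_t
{
public:
    explicit credit_line_t(long limit) : m_available(limit) {}

    // Committed immediately; nothing stays locked while the trade awaits confirmation.
    bool draw_down(long amount)
    {
        assert(amount >= 0);
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        if (amount > m_available)
            return false;
        m_available -= amount;
        return true;
    }
    // The contra-transaction: a second short transaction that undoes the first.
    void restore(long amount)
    {
        assert(amount >= 0);
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        m_available += amount;
    }
    long available() const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return m_available;
    }
private:
    mutable std::mutex m_mutex;
    long m_available;
};
```

Other traders see the reduced credit as soon as draw_down returns; if the deal falls through, restore puts it back.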
Brief summary
• Clearly model threads and apartments in your design,
and illustrate the context in which objects and
code run.
• Use const to the maximum.
• Beware locks and expensive calls.
• Avoid global or shared data or objects.
• Use asynchronous events and callbacks.
• Use very smart locks.
• Use fine granularity with transactions and consider
“contra-transactions”.
Brief summary
• COM
– VB is okay for the UI and strictly apartment model code.
– C++ provides the most flexibility for infrastructure. It can
exist in an MTA.
• COM+
– Consider using threading model ‘Neutral’.
– Consider letting COM+ do the synchronization (‘Required’).
• Read the code handouts
– critical_section_t, cs_lock_t, asynchronous event firing,
GITPtr
Commonly misused functions
• CreateThread
– Use _beginthread, _beginthreadex or AfxBeginThread.
• _beginthread, _beginthreadex
– Read what MSDN says about the handles returned.
• Sleep(0), SleepEx(0, TRUE)
– The former will yield your thread, the latter enters an alertable wait
state.
• SendMessage
– This will cause two thread swaps if the window is in a different
thread.
• DllMain
– NT holds the process critical section whilst calling DllMain, the
same one acquired by GetModuleHandle among other functions.
MSJ 96 sidebar.
Areas not covered
• Win32 thread primitives
• Priorities and scheduling
• Out of process schemes using COM
Recommended reading
• Books
– Most books about threads are like vegetables:
they are sold by weight, and past their sell-by
date.
– There is no single source.
– Java Threads - Scott Oaks & Henry Wong
about 300 pages, well presented, covers general issues in a Java context, and Java
specifics.
Recommended reading
• Internet
– http://www.lsc.nd.edu/~jsiek/tlc/btl/libs/threads/Synchronization.html
– www.wdj.com/archive/1105/feature.html
– comp.programming.threads (news group)
– www.cuj.com/archive/1610/1610list.html
– www.boost.org (will probably support threads in the future)
• Products
– www.cs.utexas.edu/users/emery/hoard/
– www.microquill.com smart heap
– www.roguewave.com threads++
– www.objectspace.com The Foundations Thread<ToolKit>
– www.ooc.com/jtc/ JThreads/C++
Recommended reading
• MSDN
– Apartment-Model Threading in Visual Basic
– Neutral Apartments
– Windows 2000 Brings Significant Refinements to the
COM(+) Programming Model
• item: Contexts and Apartments
– Services provided by COM+
• item: Concurrency
More help?
• If you would like more detail in a particular area, or
some help or guidance (for example, writing the
really smart locking class, extending GITPtr, or event
firing):
– I am very happy to help
extension 6830, cube 3412
email:[email protected]
– More than an hour? This will probably require you to bribe
Lauren Lenoble and/or Angelo Bartolotta.