people.sutd.edu.sg

50.003: Elements of Software
Construction
Week 4
Introduction to Concurrency
Plan of the Week
• First Class:
– What are sequential programs semantically? The
model we use to understand sequential programs
is not precise for concurrent programs.
• Second Class:
– Why do we want concurrent programs?
– Concurrent processes communicating through
messaging only
Language and Tools
http://www.eclipse.org/
Part 1
UNDERSTANDING SEQUENTIAL
PROGRAMS
Motivational Example
The NSA has intercepted an RSA-encrypted secret message that gives the
location of a planned terrorist act. We believe the act will happen one
week from now, and we need your help to decrypt the message.
Task: Write a Java program to factor a number as the product of two
prime numbers using the method called trial division.
Task Breakdown
• Requirements/Specification
– given a semi-prime, your program outputs its
prime factors within a certain time
– pre-condition: the input is a semi-prime
– post-condition: the program outputs its prime factors
– non-functional requirement: within a certain time
Correctness: pre-condition => post-condition
Task Breakdown
• Design
– Use the trial division method
– Read: http://en.wikipedia.org/wiki/Trial_division
– More:
http://en.wikipedia.org/wiki/Integer_factorization
• Implementation
– “Enough talk, let’s fight” (Kung Fu Panda)
Cohort Exercise I (10 min)
Write a Java program that, given a semiprime, outputs its prime factors.
Hint: You need to use the BigInteger class.
FactorPrime.java
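One possible shape for this exercise is sketched below. It is not the course's FactorPrime.java; the class name FactorPrimeSketch and its method names are made up for illustration. It assumes the pre-condition holds (the input really is a semiprime), so the smallest divisor found is prime and the cofactor is the other prime.

```java
import java.math.BigInteger;

public class FactorPrimeSketch {

    // Returns the smallest divisor d of n with d > 1 and d*d <= n,
    // or null if there is none (n is prime, or n is smaller than 4).
    static BigInteger smallestFactor(BigInteger n) {
        BigInteger two = BigInteger.valueOf(2);
        if (n.compareTo(two) > 0 && n.mod(two).signum() == 0) {
            return two;
        }
        // Only odd candidates are left; stop once d*d exceeds n.
        for (BigInteger d = BigInteger.valueOf(3);
             d.multiply(d).compareTo(n) <= 0;
             d = d.add(two)) {
            if (n.mod(d).signum() == 0) {
                return d;
            }
        }
        return null;
    }

    // Pre-condition: semiprime is a product of two primes.
    // Post-condition: returns {p, q} with p <= q and p * q == semiprime.
    static BigInteger[] factor(BigInteger semiprime) {
        BigInteger p = smallestFactor(semiprime);
        if (p == null) {
            throw new IllegalArgumentException("not a semiprime: " + semiprime);
        }
        return new BigInteger[] { p, semiprime.divide(p) };
    }

    public static void main(String[] args) {
        BigInteger[] f = factor(new BigInteger("4294967297"));
        System.out.println(f[0] + " * " + f[1]);
    }
}
```

The loop runs in time proportional to the smaller prime factor, which is why the larger test inputs are out of reach for this method.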
Task Breakdown
• Testing
– 4294967297 (famous Fermat Number)
– 1127451830576035879
–
160731047637009729259688920385507056726966793490579598495689711866432421212774967029895340327
197901756096014299132623454583177072050452755510701340673282385647899694083881316194642417451
570483466327782135730575564856185546487053034404560063433614723836456790266457438831626375556
854133866958349817172727462462516466898479574402841071703909138062456567624565784254101568378
407242273207660892036869708190688033351601539401621576507964841597205952722487750670904522932
328731530640706457382162644738538813247139315456213401586618820517823576427094125197001270350
087878270889717445401145792231674098948416888868250143592026973853973785120217077951766546939
577520897245392186547279572494177680291506578508962707934879124914880885500726439625033021936
728949277390185399024276547035995915648938170415663757378637207011391538009596833354107737156
273037494727858302028663366296943925008647348769272035532265048049709827275179381252898675965
528510619258376779171030556482884535728812916216625430187039533668677528079544176897647303445
153643525354817413650848544778690688201005274443717680593899
• Verification: how to show it always works?
Understanding Sequential Programs
“A program consisted of a sequence
of instructions (and a memory),
where each instruction executed
one after the other (to modify the
memory, etc.). It ran from start to
finish on a single processor.”
“The sequential paradigm has the
following two characteristics: the
textual order of statements specifies
their order of execution; successive
statements must be executed
without any overlap (in time) with
one another.”
int previousMax;

public int max(int[] list) {
    int max = list[0];
    for (int i = 1; i < list.length; i++) {
        if (max < list[i]) {
            max = list[i];
        }
    }
    previousMax = max;
    return max;
}
“Sequential programming is dead. So stop teaching it!”
The Illusion
    int previousMax;
0.  public int max (int[] list) {
1.      int max = list[0];
2.      for (int i = 1; 3. i < list.length; 4. i++) {
5.          if (max < list[i]) {
6.              max = list[i];
7.          }
8.      }
9.      previousMax = max;
10.     return max;
11. }
[Figure: the control flow graph of max, with one node per numbered program point 0 to 11]
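Control Flow Graph
[Figure: the control flow graph of max next to a memory box holding previousMax and the input. Nodes are the program points 0 to 11; edges and their labels:]
0 -> 1: list = …
1 -> 2: max = list[0]
2 -> 3: i = 1
3 -> 5: i < list.length
3 -> 9: i >= list.length
5 -> 6: max < list[i]
5 -> 7: max >= list[i]
6 -> 7: max = list[i]
7 -> 8, 8 -> 4: …
4 -> 3: i++
9 -> 10: previousMax = max
10 -> 11: return max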
System Execution
[Figure sequence: eleven slides stepping through the control flow graph on input [2,4]. The memory box on successive slides shows:]
1. previousMax = 0; input = [2,4]
2. previousMax = 0; input = [2,4]; list = [2,4]
3. previousMax = 0; input = [2,4]; list = [2,4]; max = 2
4 to 6. previousMax = 0; input = [2,4]; list = [2,4]; max = 2; i = 1
7 to 9. previousMax = 0; input = [2,4]; list = [2,4]; max = 4; i = 1
10 to 11. previousMax = 0; input = [2,4]; list = [2,4]; max = 4; i = 2
The Trace
• With input = [2,4]
0 -> 1 -> 2 -> 3 -> 5 -> 6 -> 7 -> 8 -> 4 -> 3 -> 9 -> 10 -> 11
(i): a configuration of the program with control at line i
The Trace
• With input = [4,2]
0 -> 1 -> 2 -> 3 -> 5 -> 7 -> 8 -> 4 -> 3 -> 9 -> 10 -> 11
(i): a configuration of the program with control at line i
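The two traces can be reproduced by instrumenting max so that it records each numbered program point as control reaches it. The sketch below is only an illustration: the class MaxTrace and the method trace are hypothetical, and the for loop is unrolled into a while loop so that points 3 and 4 can be recorded separately.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxTrace {
    static int previousMax;

    // Runs the max computation and returns the sequence of program points visited.
    static List<Integer> trace(int[] list) {
        List<Integer> visited = new ArrayList<>();
        visited.add(0);                          // 0: method entry
        visited.add(1);
        int max = list[0];                       // 1: max = list[0]
        visited.add(2);
        int i = 1;                               // 2: i = 1
        while (true) {
            visited.add(3);
            if (!(i < list.length)) break;       // 3: loop condition
            visited.add(5);
            if (max < list[i]) {                 // 5: comparison
                visited.add(6);
                max = list[i];                   // 6: update max
            }
            visited.add(7);                      // 7: end of if
            visited.add(8);                      // 8: end of loop body
            visited.add(4);
            i++;                                 // 4: i++
        }
        visited.add(9);
        previousMax = max;                       // 9: previousMax = max
        visited.add(10);                         // 10: return max
        visited.add(11);                         // 11: method exit
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(trace(new int[] {2, 4}));
        System.out.println(trace(new int[] {4, 2}));
    }
}
```

Calling trace twice on the same input always yields the same list, which is the determinism that the next slide relies on.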
Sequential Programming is Easy
• It is deterministic: for a given input, there is exactly one
path through the control flow graph
[Figure: five inputs (input1 to input5), each with its own single trace 0, 1, 2, 3, …]
Testing is to find the ‘right’ input
Cohort Exercise 2 (5 min)
Draw the control flow graph of the program you
developed in Cohort Exercise 1.
Reality is Messy
[Figure: layers of execution, top to bottom: Java Programs, Bytecode, JVM, Physical Machine]
• What are the atomic steps? For example, how many steps is “i++”?
• What is the order of execution? Given “i++; j++; i++”, can we switch the
last two statements?
• Where are the variable values stored? Cache, heap memory, stack memory.
None of these details mattered until concurrency arrived.
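To see why the number of atomic steps matters, the sketch below models “i++” on a shared variable as three steps (load, add, store) and replays a given schedule of two workers, A and B. This is a simulation in a single thread, not real concurrent code; the class LostUpdateSketch, the three-step split, and the schedule string format are assumptions made for illustration, and the schedule may contain only the letters A and B.

```java
public class LostUpdateSketch {

    // Replays a schedule such as "AAABBB": each character lets worker A or B
    // execute its next step of "shared++" (0: load, 1: add, 2: store).
    static int run(String schedule) {
        int shared = 0;
        int[] register = new int[2];   // each worker's private copy
        int[] step = new int[2];       // each worker's next step
        for (char c : schedule.toCharArray()) {
            int w = c - 'A';
            switch (step[w]++) {
                case 0: register[w] = shared; break;           // load
                case 1: register[w] = register[w] + 1; break;  // add
                case 2: shared = register[w]; break;           // store
                default: break;                                // worker already finished
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(run("AAABBB"));  // one worker after the other
        System.out.println(run("ABABAB"));  // steps interleaved
    }
}
```

When the workers run one after the other, both increments are kept. When their steps interleave, both load the old value and one increment is lost. A purely sequential program never exposes this difference.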
Part 2
INTRODUCING CONCURRENCY
Concurrency: Benefit
• Better resource utilization
– With k processors, ideally we can be k times faster,
if the task can be broken into k independent pieces and if we ignore the cost of task decomposition and communication between the processors
[Timeline, one processor: Read file A, Process A, Read file B, Process B]
[Timeline, two processors: Processor 2 does Read file A, Read file B; Processor 1 does Process A, Process B]
We can factorize the semi-prime faster with multiple computers or cores
• Can we get better performance with 1 processor
only?
[Timeline, one processor: Read file A, then Read file B overlapped with Process A, then Process B]
Concurrency: Cost
• More complex design, implementation, testing, and
verification
Will the exception occur?
public class Holder {
    private int n;

    public Holder(int n) { this.n = n; }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}
• Overhead in task decomposition, communication, and
context switching
• Increased resource consumption
Distributed Systems
[Figure: several computers, each with its own CPU and Memory, exchanging messages over a Network]
• Each process has its own memory and
processes communicate through messaging.
Multi-core Processors
[Figure: several CPUs, each with its own Cache, all connected to one shared Memory]
• Each thread has its own cache and threads
communicate through a shared memory.
Multi-core Computer: More Like This
Part 3
PROGRAMMING WITH SOCKETS
Processes
• A process is an instance of a running program
that is isolated from other processes on the same
machine (particularly for resources like memory)
• The operating system tries to make the program feel like
it has the whole machine to itself, as if a fresh computer
with fresh memory had been created
• By default, processes have no shared memory
(sharing memory needs special effort)
• Automatically ready for message passing
(standard input & output streams)
Interprocess Communication
• Using sockets
• Using input and output streams
• Using RMI or CORBA
– RMI: remote method invocation
– CORBA: common object request broker
architecture
We will not talk about or use RMI or CORBA.
Read: http://en.wikipedia.org/wiki/Java_remote_method_invocation
http://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture
Addresses
• IP Addresses: “172.18.180.17” or
“fe80::7517:c1af:b2bb:da73%4”
– Like a street address, it gives the location of a
computer so that messages can be sent there
– “localhost”: a name for the local machine itself
• Ports
– A single computer may be performing many kinds
of network communications; a port number identifies which one a message is for
– Ports 20 and 21: FTP (data and control); 23: Telnet; 80: HTTP
Sockets Example
Write a Java program which allows one client
to “talk” to one server (which may or may not be on the
same computer).
EchoServer.java and EchoClient.java
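A minimal sketch of such a pair is shown below. It is not the course's EchoServer.java or EchoClient.java. The class EchoSketch, the "echo: " prefix, and the demo method are made up for illustration, and both ends run in one JVM (the server on a background thread) only so that the example is self-contained. In the exercise they would be two separate programs, possibly on different machines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EchoSketch {

    // Server side: accept one client and echo each line back until "Bye".
    static void serveOneClient(ServerSocket serverSocket) {
        try (Socket clientSocket = serverSocket.accept();   // blocks until a client connects
             PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(clientSocket.getInputStream()))) {
            String input;
            while ((input = in.readLine()) != null && !input.equals("Bye")) {
                out.println("echo: " + input);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: send each line, wait for the reply, then say "Bye".
    static List<String> talk(String hostName, int portNumber, List<String> lines) {
        List<String> replies = new ArrayList<>();
        try (Socket echoSocket = new Socket(hostName, portNumber);
             PrintWriter out = new PrintWriter(echoSocket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(echoSocket.getInputStream()))) {
            for (String line : lines) {
                out.println(line);
                replies.add(in.readLine());   // blocks until the server answers
            }
            out.println("Bye");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return replies;
    }

    // Demonstration only: runs the server on a background thread in the same JVM.
    static List<String> demo(List<String> lines) {
        try (ServerSocket serverSocket = new ServerSocket(0)) {   // port 0: pick any free port
            Thread server = new Thread(() -> serveOneClient(serverSocket));
            server.start();
            List<String> replies = talk("localhost", serverSocket.getLocalPort(), lines);
            server.join();
            return replies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("hello", "world")));
    }
}
```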
Cohort Exercise 3 (15 min)
Write a Java program which allows a fixed number N of
processes to talk in a fixed order (assume that one
process serves as the server).
Sample solution: ChatServer.java, ChatClient.java
What if we want to allow any client to talk at any time?
Control Flow Graph
[Figure: two control flow graphs side by side, each with its own memory and nodes 0 to 5. Edge labels, in order:]
server: serverSocket = …, clientSocket = …, out = …, in = …, stdIn = …; input = in.readLine(); then either input is “Bye” (to node 5) or otherwise; stdIn.readLine(); out.println(…)
client: hostName = …, portNumber = …, echoSocket = …, out = …, in = …, stdIn = …; userInput = stdIn.readLine(); then either userInput is “Bye” (to node 5) or otherwise out.println(…); in.readLine(); …
Configuration 0-0
[Figure: the server and client control flow graphs, with the server at node 0 and the client at node 0]
Configuration 1-0
[Figure: the same graphs, with the server at node 1 and the client at node 0]
Configuration 0-1
[Figure: the same graphs, with the server at node 0 and the client at node 1]
Non-determinism
[Figure: configuration 0-0 branches to either 1-0 or 0-1]
• The choice between 1-0 and 0-1 is determined at
run-time, depending on the relative speed of
server and client.
• Even if you know all the inputs, there might be
multiple different traces.
• Testing is to find the ‘right’ trace.
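To get a feel for how quickly the number of traces grows, the sketch below enumerates every way the steps of two independent sequential programs can interleave, writing each configuration as "server-client". The class Interleavings is hypothetical, and the model is deliberately simplified: it treats each side as a straight line of steps and ignores blocking calls such as readLine(), which in the real server and client rule out some of these orders.

```java
import java.util.ArrayList;
import java.util.List;

public class Interleavings {

    // Returns all traces from configuration 0-0 to serverSteps-clientSteps,
    // where each move advances either the server or the client by one step.
    static List<List<String>> traces(int serverSteps, int clientSteps) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        current.add("0-0");
        extend(0, 0, serverSteps, clientSteps, current, result);
        return result;
    }

    private static void extend(int s, int c, int maxS, int maxC,
                               List<String> current, List<List<String>> result) {
        if (s == maxS && c == maxC) {
            result.add(new ArrayList<>(current));   // both sides finished: record the trace
            return;
        }
        if (s < maxS) {                             // the server moves next
            current.add((s + 1) + "-" + c);
            extend(s + 1, c, maxS, maxC, current, result);
            current.remove(current.size() - 1);
        }
        if (c < maxC) {                             // the client moves next
            current.add(s + "-" + (c + 1));
            extend(s, c + 1, maxS, maxC, current, result);
            current.remove(current.size() - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(traces(1, 1));
        System.out.println(traces(3, 3).size());
    }
}
```

With one step on each side there are already two traces (through 1-0 or through 0-1), even though the inputs are identical.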
Atomicity
[Figure: the server control flow graph, with the step from configuration 0-0 to 1-0 labelled: serverSocket = …, clientSocket = …, out = …, in = …, stdIn = …]
Is this really one step or many?
What is an atomic step?
What is the impact of the
answer to the last question?
Configuration 11
[Figure: the server and client control flow graphs with the graph of joint configurations drawn between them: 0-0, 1-0, 0-1, 1-1, 0-2, 1-2, …, 1-3, 2-3, 3-3, 4-3, 5-3, …]
Cohort Exercise 4 (15 min)
Continue with cohort exercise 1. Assume that
you are given four computers: one acts as a
server, and the three clients do the actual work
of factoring and display the result as soon as it is
ready. The server knows the semi-prime and
assigns the jobs.
Q1: How do we assign the tasks?
Q2: How do we terminate the other computers when one solves it?
FactorPrimeClient.java; FactorPrimeServer.java
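One possible answer to Q1 is to split the candidate divisors by stride, so that each client tests every k-th odd number. The sketch below shows only that splitting, without any networking. It is not taken from FactorPrimeClient.java; the class StridedTrialDivision and its parameters are hypothetical, and it assumes the semi-prime is odd.

```java
import java.math.BigInteger;

public class StridedTrialDivision {

    // Worker number `index` (0-based) out of `workers` tests the odd candidates
    // 3 + 2*index, 3 + 2*index + 2*workers, ... up to sqrt(n).
    // Returns a divisor found in its share, or null if its share contains none.
    static BigInteger search(BigInteger n, int index, int workers) {
        BigInteger step = BigInteger.valueOf(2L * workers);
        for (BigInteger d = BigInteger.valueOf(3L + 2L * index);
             d.multiply(d).compareTo(n) <= 0;
             d = d.add(step)) {
            if (n.mod(d).signum() == 0) {
                return d;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("4294967297");
        for (int index = 0; index < 3; index++) {
            System.out.println("worker " + index + ": " + search(n, index, 3));
        }
    }
}
```

In a client/server version the server would only need to send n, index and workers to each client. Q2 (stopping the others once one client reports a factor) is left open here.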
Timeout
• By default, the following calls are blocking.
– Example 1: Socket clientSocket1 = serverSocket.accept();
– Example 2:
BufferedReader in = new BufferedReader(
    new InputStreamReader(clientSocket.getInputStream()));
String inputLine = in.readLine();
• Blocking can lead to deadlock.
– A deadlock is a state where no instruction can be
executed (and not all programs have terminated).
SocketTimeout.java
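The sketch below shows one way to bound a blocking accept() with ServerSocket.setSoTimeout; the same setSoTimeout method exists on Socket for bounding reads. It is not the course's SocketTimeout.java, and the class AcceptTimeoutSketch and the method waitForClient are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class AcceptTimeoutSketch {

    // Waits for one client on a free port.
    // Returns true if a client connected, false if accept() timed out.
    static boolean waitForClient(int timeoutMillis) {
        try (ServerSocket serverSocket = new ServerSocket(0)) {   // port 0: pick any free port
            serverSocket.setSoTimeout(timeoutMillis);             // accept() now gives up after this long
            try (Socket client = serverSocket.accept()) {
                return true;
            } catch (SocketTimeoutException e) {
                return false;                                     // nobody connected in time
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("client arrived: " + waitForClient(500));
    }
}
```

With the timeout set, a server that would otherwise wait forever gets control back and can decide what to do next.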
Cohort Exercise 5 (5 min)
Continue with cohort exercise 4, assuming this
time that you don’t know how many clients
there are to help you. Use a socket timeout to collect
clients and then assign the jobs.
FactorPrimeClientMul.java; FactorPrimeServerMul.java
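The collection step could look like the sketch below: keep accepting clients until accept() times out, then treat whoever has connected as the worker pool. It is not taken from FactorPrimeServerMul.java; the class CollectClientsSketch and its methods are hypothetical, and demo connects the clients from the same program purely so that the example runs on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

public class CollectClientsSketch {

    // Accepts clients until no new client arrives within timeoutMillis.
    static List<Socket> collectClients(ServerSocket serverSocket, int timeoutMillis)
            throws IOException {
        serverSocket.setSoTimeout(timeoutMillis);
        List<Socket> clients = new ArrayList<>();
        while (true) {
            try {
                clients.add(serverSocket.accept());
            } catch (SocketTimeoutException e) {
                return clients;   // nobody else showed up: stop collecting
            }
        }
    }

    // Demonstration only: connects numClients sockets locally, then collects them.
    static int demo(int numClients) {
        List<Socket> outgoing = new ArrayList<>();
        try (ServerSocket serverSocket = new ServerSocket(0)) {
            for (int k = 0; k < numClients; k++) {
                // The connection waits in the server's backlog until it is accepted.
                outgoing.add(new Socket(InetAddress.getLoopbackAddress(),
                                        serverSocket.getLocalPort()));
            }
            List<Socket> accepted = collectClients(serverSocket, 300);
            int count = accepted.size();
            for (Socket s : accepted) s.close();
            for (Socket s : outgoing) s.close();
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("clients collected: " + demo(3));
    }
}
```

Once the list is known, its size can serve as the number of workers when the jobs are divided.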