Using SCTP to hide latency in MPI programs
Brad Penoff, H. Kamal, M. Tsai, E. Vong, A. Wagner
Department of Computer Science
University of British Columbia
Vancouver, Canada
Distributed Systems Group
April 25, 2006
Overview
- Motivation
- SCTP
- Processor Farm Implementation
- Examples
Motivation
- Extend the operability of MPI message-passing applications to WANs
  - High latency (milliseconds)
  - Congestion and loss
  - Standard IP transport mechanisms
  - Heterogeneous environment
- Why?
  - Suitable for some compute-intensive applications
  - Interoperability; inter-cluster
  - Distributed resources
What is SCTP?
Stream Control Transmission Protocol
- IETF-standardized IP transport protocol
- Message oriented like UDP
- Reliable, in-order delivery like TCP, but with multiple streams
- Available on most major operating systems
Why SCTP?
- Added resilience
- Multi-streaming
- Improved congestion control (e.g. built-in SACK)
- Multi-homing
- Added security
- Message oriented
- One-to-many socket

[Figure: a one-to-many socket on CPU 1 with an association to CPU 2 and another to CPU 3; each association carries multiple streams (stream 1, stream 2).]
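To make the one-to-many socket and per-message streams above concrete, here is a minimal sketch (not part of the original slides) using the lksctp socket API on Linux; the peer address, port, and payloads are placeholders.

    /*
     * Minimal sketch: two messages sent on different SCTP streams of one
     * association, over a one-to-many (SOCK_SEQPACKET) socket.
     * Compile on Linux with -lsctp. Address and port are placeholders.
     */
    #include <string.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <netinet/sctp.h>

    int main(void)
    {
        int sd = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP); /* one-to-many socket */

        struct sockaddr_in peer;
        memset(&peer, 0, sizeof(peer));
        peer.sin_family = AF_INET;
        peer.sin_port   = htons(5001);                  /* placeholder port */
        inet_pton(AF_INET, "10.0.0.2", &peer.sin_addr); /* placeholder peer */

        const char a[] = "message on stream 0";
        const char b[] = "message on stream 1";

        /* Messages on different streams are delivered independently:
         * loss of 'a' does not hold back delivery of 'b'. The association
         * to the peer is set up implicitly on the first send. */
        sctp_sendmsg(sd, a, sizeof(a), (struct sockaddr *)&peer, sizeof(peer),
                     0 /* ppid */, 0 /* flags */, 0 /* stream 0 */, 0, 0);
        sctp_sendmsg(sd, b, sizeof(b), (struct sockaddr *)&peer, sizeof(peer),
                     0 /* ppid */, 0 /* flags */, 1 /* stream 1 */, 0, 0);

        return 0;
    }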
SCTP-based MPI for WANs
- Close match to MPI
- Mapping tags to streams avoids head-of-line blocking (illustrated below)
- Automatically leverage other SCTP features

  SCTP                 MPI
  One-to-many socket   Context
  Association          Rank
  Streams              Tag
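The tag-to-stream row of the table is the key design point. Purely as an illustration (the slides do not show the middleware's actual scheme, so the name and modulo mapping below are assumptions), a mapping like this keeps per-tag ordering while letting different tags travel on different SCTP streams:

    #include <stdint.h>

    /* Illustrative only: same tag always maps to the same stream, preserving
     * MPI's per-tag message ordering; different tags spread across streams,
     * so a lost segment carrying one tag does not delay messages that carry
     * other tags. */
    static uint16_t stream_for_tag(int tag, uint16_t num_outbound_streams)
    {
        return (uint16_t)((unsigned)tag % num_outbound_streams);
    }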
Using SCTP for MPI applications
- Automatic
  - SCTP helps to reduce the effect of segment loss
- Need to change applications (see the sketch after this list):
  - Use tags to identify independent message streams
  - Overlap computation and communication (non-blocking communication)
  - Avoid head-of-line blocking
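A hedged example of those application changes, assuming two independent message classes with made-up tags; this is generic MPI code, not code from the talk:

    #include <mpi.h>

    #define TAG_A 1   /* e.g. task data (placeholder tag values) */
    #define TAG_B 2   /* e.g. control messages */

    void receive_independent_streams(void)
    {
        double a[1024];
        int    b[64];
        MPI_Request req[2];
        int idx;

        /* Post non-blocking receives on distinct tags; over an SCTP-based
         * MPI, each tag can ride its own stream. */
        MPI_Irecv(a, 1024, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_A, MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(b,   64, MPI_INT,    MPI_ANY_SOURCE, TAG_B, MPI_COMM_WORLD, &req[1]);

        for (int done = 0; done < 2; done++) {
            /* Handle whichever message arrives first; a delayed or lost
             * segment on one tag does not block progress on the other, and
             * computation can be overlapped here. */
            MPI_Waitany(2, req, &idx, MPI_STATUS_IGNORE);
            /* ... process a[] or b[] depending on idx ... */
        }
    }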
MPI Applications for WAN?
- Parallel task farms (pfarms)
  - Common strategy for a large number of independent tasks
  - Typical properties:
    - Request driven; tasks are processed at the rate of the worker
    - Dynamically load-balanced
    - Dynamic processes
    - Centralized or decentralized, if necessary
  - Provided as a template

[Figure: the Manager creates tasks and processes results; Workers do tasks.]
Developed Pfarm Template
- Small API (sketched below)
  - createTask, doTask, processResult
- Provided tunable parameters
  - Number of outstanding requests
  - Number of available buffers per worker
  - Number of tasks per request
- Managed MPI non-blocking communication
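As a rough sketch of that API (only the three function names and the tunable parameters come from the slide; the struct and all signatures are assumptions):

    /* Hypothetical shape of the template's small API. */
    typedef struct {
        int outstanding_requests;  /* number of requests a worker keeps in flight */
        int buffers_per_worker;    /* number of task buffers held by each worker  */
        int tasks_per_request;     /* number of tasks returned for one request    */
    } pfarm_params;

    /* Manager side: produce the next task, or return 0 when none remain. */
    int  createTask(void *task_buf);

    /* Worker side: compute on one task and fill in its result. */
    void doTask(const void *task_buf, void *result_buf);

    /* Manager side: consume one completed result. */
    void processResult(const void *result_buf);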
Ideal case
[Timeline: Manager (create tasks, process results) vs. Worker (do tasks). Each task request is sent one round-trip time (RTT) ahead: the manager creates Task 2 while the worker is still computing Task 1, so the next task arrives before the current one finishes and the worker never idles.]
Unable to hide latency
[Timeline: same exchange, but the RTT exceeds the task computation time, so the worker finishes Task 1 before Task 2 arrives and sits idle between tasks.]
Task buffering
[Figure: the worker process keeps several task requests outstanding and buffers the tasks that arrive, hiding latency under varying task times and varying network times (RTT).]
Task buffering
[Figure: with higher latency the same buffering can still run dry, leaving the worker idle until further tasks arrive.]
Program template

Worker program (expanded into a fuller sketch below):

    while (tasks) {
        MPI_Waitany(&j);      /* a posted receive is ready           */
        post Isend(request);  /* request more work                   */
        do_task(j);           /* compute on the task that is ready   */
        Isend(result);        /* return the result                   */
        post Irecv(task);     /* post a receive for the next task    */
    }

[Figure: the worker program runs over the middleware and transport layer; requests, tasks, and results move through send(), recv(), and advance() calls.]
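Below is a fuller, self-contained sketch of this worker loop in MPI, with assumed tags, buffer counts, and message sizes; the real template's shutdown protocol and bookkeeping are omitted:

    #include <mpi.h>

    #define NBUF        4      /* assumed number of task buffers per worker */
    #define TASK_SIZE   1024
    #define TAG_REQUEST 10     /* placeholder tags */
    #define TAG_TASK    11
    #define TAG_RESULT  12
    #define MANAGER     0

    void worker_loop(int ntasks)
    {
        static double task[NBUF][TASK_SIZE], result[NBUF][TASK_SIZE];
        static int    req_msg[NBUF];
        MPI_Request recv_req[NBUF], send_req[NBUF];
        int j;

        /* Prime the pipeline: keep NBUF requests and receives outstanding
         * so new tasks arrive while the current one is being computed. */
        for (j = 0; j < NBUF; j++) {
            send_req[j] = MPI_REQUEST_NULL;
            req_msg[j]  = j;
            MPI_Irecv(task[j], TASK_SIZE, MPI_DOUBLE, MANAGER, TAG_TASK,
                      MPI_COMM_WORLD, &recv_req[j]);
            MPI_Send(&req_msg[j], 1, MPI_INT, MANAGER, TAG_REQUEST, MPI_COMM_WORLD);
        }

        for (int done = 0; done < ntasks; done++) {
            MPI_Waitany(NBUF, recv_req, &j, MPI_STATUS_IGNORE);  /* a task is ready */
            MPI_Send(&req_msg[j], 1, MPI_INT, MANAGER, TAG_REQUEST, MPI_COMM_WORLD);
            /* do_task(task[j], result[j]);  compute while further tasks arrive */
            MPI_Wait(&send_req[j], MPI_STATUS_IGNORE);       /* previous result gone? */
            MPI_Isend(result[j], TASK_SIZE, MPI_DOUBLE, MANAGER, TAG_RESULT,
                      MPI_COMM_WORLD, &send_req[j]);
            MPI_Irecv(task[j], TASK_SIZE, MPI_DOUBLE, MANAGER, TAG_TASK,
                      MPI_COMM_WORLD, &recv_req[j]);          /* repost this buffer */
        }

        MPI_Waitall(NBUF, send_req, MPI_STATUSES_IGNORE);     /* drain result sends */
        /* The NBUF receives still posted would be matched or cancelled by the
         * template's shutdown protocol (omitted here). */
    }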
Examples
- Robust correlation matrix computation
  - Large regular matrix computation (our own program)
  - Directly linked to the template
- mpiBLAST
  - Parallel version of a popular bioinformatics tool (existing program)
  - Integrated the template into the program
mpiBLAST: Latency and Loss
[Chart: mpiBLAST run time (Time, 0-900) for SCTP vs. TCP under increasing latency (0 ms, 20 ms, 40 ms, 80 ms) and increasing loss (0%, 1%, 2%).]
Conclusions. What did we discover?
- Some latency-hiding techniques increase performance regardless of transport
- SCTP-based MPI handles latency and loss better than TCP in real applications
- Requires application changes to see the full benefits
  - Non-blocking communication
  - Multiple tags to utilize streams
- Head-of-line blocking occurs in real applications
Thank you!
More information about our work is at:
http://www.cs.ubc.ca/labs/dsg/mpi-sctp/
Or Google "sctp mpi"

Upcoming annual SCTP Interop
- July 30 – Aug 4, 2006, to be held at UBC
- Vendors and implementers test their stacks
  - Performance
  - Interoperability