DMP: Deterministic Shared Memory Multiprocessing Joseph Devietti et al.


ECE 259 / CPS 221 Advanced Computer Architecture II
DMP: Deterministic Shared Memory Multiprocessing
Joseph Devietti et al.
Presenter: Tae Jun Ham
2012. 3. 19
Abstract

- Most current shared memory multicore and multiprocessor systems are nondeterministic.
- Non-determinism makes debugging and testing hard.
- Previous approaches were based on replay, but replay is only useful for debugging.
- Based on deterministic inter-thread communication, this paper suggests several ways to achieve deterministic shared memory multiprocessing.
Determinism

What is Deterministic Parallel Execution?
- Executes multiple threads that communicate via shared memory
- Should produce the same output if given the same program input
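To make the definition concrete, here is a minimal sketch (not from the paper) that models two threads, each performing a non-atomic increment (a load followed by a store) on one shared variable. The names `interleavings` and `run` are hypothetical helpers. Enumerating every interleaving shows that the same input can yield different outputs, while fixing the interleaving always yields the same one.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every order in which the steps of threads A and B can interleave."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        a_slots = set(a_slots)
        yield ["A" if i in a_slots else "B" for i in range(total)]

def run(schedule):
    """Run the schedule; each thread's first step is a load, its second a store."""
    shared = 0
    local = {"A": None, "B": None}
    for tid in schedule:
        if local[tid] is None:
            local[tid] = shared        # load the shared value
        else:
            shared = local[tid] + 1    # store the incremented value
    return shared

# Nondeterministic execution: the result depends on the interleaving.
outcomes = {run(s) for s in interleavings(2, 2)}

# Deterministic execution: one fixed interleaving, one result.
fixed = ["A", "A", "B", "B"]
fixed_result = run(fixed)
```

Here `outcomes` contains more than one value because some interleavings lose an update, whereas `fixed_result` is the same on every run.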

What causes Non-determinism?
- Software sources: concurrent threads, the state of memory pages, power saving modes, disk and I/O buffers, and some OS system calls.
- Hardware sources: the state of caches, predictor tables, bus priority controllers, and bus arbiters. In other words, almost all microarchitectural structures.
Non-determinism
DMP-Serial (Fine-Grained)
DMP-Serial (QBcount)
DMP-Serial (Coarse-Grained)
DMP-ShTab



- Communication-free regions: parallel
- Communication: serial
Rules
- Without the token: a thread may read shared addresses and write addresses it owns.
- With the token: a thread may do everything.
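The rules above can be sketched as a lookup in a sharing table. This is an illustrative model only: the `sharing_table` dictionary, the `SHARED` marker, and the `needs_token` function are hypothetical names, and the sketch assumes that a thread may also read (not just write) addresses it owns, which the slide does not spell out.

```python
SHARED = "shared"  # hypothetical marker: the address is shared among threads

def needs_token(sharing_table, thread_id, op, addr):
    """Return True if the access must wait until the thread holds the token.

    sharing_table maps an address to its owning thread id, or to SHARED.
    Addresses missing from the table are treated as shared (an assumption).
    """
    owner = sharing_table.get(addr, SHARED)
    if owner == thread_id:
        return False   # own address: no communication, may run in parallel
    if owner == SHARED and op == "read":
        return False   # reading shared data is allowed without the token
    return True        # writes to shared data, or any access to another thread's data

# Usage: thread 0 owns 0x10, thread 1 owns 0x20, 0x30 is shared.
table = {0x10: 0, 0x20: 1, 0x30: SHARED}
```

Accesses for which `needs_token` returns True are the communication points that get serialized; everything else runs in parallel.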
DMP-TM & DMP-TM-Fwd
QB-SyncFollow & QB-Sharing
- QB-SyncFollow: after an unlock, pass the token
- QB-Sharing: after finishing work on shared data, pass the token
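As a rough illustration of the QB-SyncFollow heuristic, the sketch below splits one thread's instruction trace into quanta. It assumes a quantum normally ends after a fixed instruction count and, when the heuristic is enabled, also ends right after an unlock so the token can be passed early. The `build_quanta` function and the string-based trace are hypothetical; QB-Sharing is not modeled here.

```python
def build_quanta(trace, quantum_size, sync_follow=False):
    """Split a list of instruction names into quanta.

    A quantum ends after `quantum_size` instructions. With `sync_follow`
    set, it also ends immediately after an "unlock" instruction.
    """
    quanta, current = [], []
    for instr in trace:
        current.append(instr)
        ends_on_size = len(current) == quantum_size
        ends_on_unlock = sync_follow and instr == "unlock"
        if ends_on_size or ends_on_unlock:
            quanta.append(current)   # the token would be passed here
            current = []
    if current:
        quanta.append(current)       # trailing partial quantum
    return quanta

trace = ["lock", "store", "unlock", "add", "add", "add"]
plain = build_quanta(trace, quantum_size=4)
follow = build_quanta(trace, quantum_size=4, sync_follow=True)
```

With the heuristic enabled, the first quantum stops at the unlock instead of running one more instruction, so a thread waiting on that lock would receive the token sooner.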
Evaluation - Performance
Serial: slowdown grows linearly with the number of threads
ShTab: 38%; TM-Fwd: 21%
Evaluation - Quanta size sensitivity
In general, larger quanta are slower. The Serial case is less sensitive to quantum size.
Evaluation - Heuristics on quanta size
Effective for ShTab. SyncFollow benefits some workloads.
Evaluation - Sw-DMP
The authors say: "In summary, this data shows that Sw-DMP-ShTab does not unduly limit performance scalability for multithreaded applications."
Discussions

Can this system be deployed?
- Too much performance overhead
- Implementation complexity

Which one do you prefer?
- DMP vs. deterministic replay

Possible power saving with DVFS?