
Asynchronous Processor Design
for ELEC 6200
by Wei Jiang
ELEC 6200, Fall 07, Oct 24
Jiang: Async. Processor
Why Asynchronous Design

Higher Performance
  No global clock
  Processes data at the rate appropriate to the environment
  Local delays do not propagate globally
Better Power Efficiency
  Only active functional units consume power
  Inactive parts remain in a “stand-by” state
  No wasteful power dissipation from glitches
Smaller Chip Size
Fewer high-frequency EMI components, due to small-amplitude, wide current peaks
Sync vs. Async Pipeline

Asynchronous Pipeline:


No global clock
No delay for clock transition
Dynamic Voltage Control


The power supply is controlled by measuring the occupancy of the input FIFO
As the FIFO fills, the supply voltage increases and the processor runs faster to empty the FIFO
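The occupancy-driven control described above can be sketched as follows. This is only an illustration: the slides do not give a control law, so the linear mapping, the function name `supply_voltage`, and the default voltage range (`v_min`, `v_max`) are all assumptions.

```python
def supply_voltage(occupancy: int, depth: int,
                   v_min: float = 1.0, v_max: float = 3.0) -> float:
    """Map input-FIFO occupancy to a supply voltage.

    Assumed (hypothetical) control law: linear interpolation between
    v_min for an empty FIFO and v_max for a full FIFO.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    # Clamp the reading so the result always stays within [v_min, v_max].
    fill = min(max(occupancy, 0), depth) / depth
    return v_min + fill * (v_max - v_min)
```

With these assumed defaults, an empty FIFO yields `v_min` and a full FIFO yields `v_max`, so a fuller FIFO always requests an equal or higher voltage.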
Asynchronous Circuit Design

Level-sensitive/Four-phase signaling protocol


Transition-sensitive/two-phase protocol:
  simpler design
  no “nonactive” state
  fewer transitions
  higher performance
  lower power dissipation
Asynchronous Interface



Two-phase pipeline handshake protocol
Sender: must ensure that all bits of the data bundle are valid before sending the Request event (local delay management)
Receiver: must accept the data bundle before sending its Acknowledge, which permits the Sender to remove the old data and place a new value on the data lines
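The Sender and Receiver obligations above can be sketched as a small software model. This is not taken from the slides: the class `TwoPhaseChannel` and its method names are hypothetical, and each wire is modeled as a boolean level whose every change (rising or falling) counts as one event.

```python
class TwoPhaseChannel:
    """Bundled-data channel with two-phase (transition) signaling.

    Hypothetical model: a Request or Acknowledge event is any toggle
    of the corresponding wire, regardless of direction.
    """

    def __init__(self):
        self.req = False   # level of the Request wire
        self.ack = False   # level of the Acknowledge wire
        self.data = None   # the data bundle

    def can_send(self) -> bool:
        # The channel is free when every Request has been acknowledged.
        return self.req == self.ack

    def send(self, value) -> None:
        if not self.can_send():
            raise RuntimeError("previous bundle not yet acknowledged")
        self.data = value          # 1. make the data bundle valid first
        self.req = not self.req    # 2. then issue the Request event

    def can_receive(self) -> bool:
        # A Request is pending when the two wires differ.
        return self.req != self.ack

    def receive(self):
        if not self.can_receive():
            raise RuntimeError("no pending Request")
        value = self.data          # 1. accept the data bundle first
        self.ack = not self.ack    # 2. then issue the Acknowledge event
        return value
```

In this model two consecutive transfers leave both wires back at their initial levels, which illustrates that there is no separate return-to-zero phase between transfers.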
Event Driven Pipeline



Components must respond to input transitions (events) rather than logic levels
Muller C-gate: outputs a transition only when all inputs have experienced a transition that changed their levels
Storage elements must respond identically to rising and falling transitions
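The Muller C-gate behavior described above can be sketched with a level-based model: the output moves to a new level only once every input has reached that level, and otherwise holds its previous value. The class name `MullerC` and the `update` method are hypothetical names for this illustration.

```python
class MullerC:
    """Behavioral model of a Muller C-gate (C-element)."""

    def __init__(self, initial: bool = False):
        self.out = initial

    def update(self, *inputs: bool) -> bool:
        """Apply the current input levels and return the output level."""
        if not inputs:
            raise ValueError("at least one input is required")
        if all(inputs):
            self.out = True        # every input has risen: output rises
        elif not any(inputs):
            self.out = False       # every input has fallen: output falls
        # Inputs disagree: hold the previous output (no output event).
        return self.out
```

The hold case is what distinguishes the gate from an AND or OR gate: a transition on only some of the inputs produces no output event.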
Asynchronous Storage

Capture-Pass latch





An event on the Capture line causes the latch to hold the input data
A Capture-done event indicates completion of the capture operation
An event on the Pass line causes the latch to return to the transparent state
A Pass-done event indicates completion of the pass operation
Capture-Pass latches are interconnected using Muller C-gates to form a pipeline
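The latch behavior listed above can be sketched as a small behavioral model. The class `CapturePassLatch` and its attribute names are hypothetical; the done signals are modeled as wire levels that toggle once per completed operation, and rejecting out-of-order events is a modeling choice rather than something stated in the slides.

```python
class CapturePassLatch:
    """Behavioral sketch of a Capture-Pass latch (hypothetical model)."""

    def __init__(self):
        self.d_in = None
        self._held = None
        self.transparent = True     # the latch starts transparent
        self.capture_done = False   # toggles after each capture
        self.pass_done = False      # toggles after each pass

    @property
    def d_out(self):
        # Transparent: output follows input. Otherwise: output is held.
        return self.d_in if self.transparent else self._held

    def capture(self) -> None:
        """Event on the Capture line: hold the current input data."""
        if not self.transparent:
            raise RuntimeError("Capture event while already holding")
        self._held = self.d_in
        self.transparent = False
        self.capture_done = not self.capture_done   # Capture-done event

    def pass_(self) -> None:
        """Event on the Pass line: return to the transparent state."""
        if self.transparent:
            raise RuntimeError("Pass event while already transparent")
        self.transparent = True
        self.pass_done = not self.pass_done         # Pass-done event
```

After `capture()`, changes to `d_in` no longer reach `d_out` until `pass_()` is called.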
Prevention of Hazards

Data Hazards




  An instruction depends on the result of a previous instruction still in the pipeline
  Prevented by the Register Locking mechanism: stall the instruction until write-back of the register takes place
  Pipeline must support multiple asynchronous read/write operations
Control Hazards


  Instructions after a branch instruction are loaded into the pipeline but may not be executed
  Prevented by the Delayed Branch mechanism: the next instruction after a branch is always executed
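The Register Locking idea listed under Data Hazards can be sketched as follows. This is a simplified illustration, not the mechanism of any particular processor: the class `LockedRegisterFile`, its per-register counter of pending writes, and the use of `None` to signal a stall are all assumptions.

```python
class LockedRegisterFile:
    """Register file with per-register locks (hypothetical sketch)."""

    def __init__(self, n_regs: int = 16):
        self.values = [0] * n_regs
        self.pending = [0] * n_regs   # outstanding writes per register

    def lock(self, rd: int) -> None:
        """Called when an instruction that will write rd is issued."""
        self.pending[rd] += 1

    def try_read(self, rs: int):
        """Return the register value, or None if the reader must stall."""
        if self.pending[rs] > 0:
            return None               # result not yet written back
        return self.values[rs]

    def write_back(self, rd: int, value: int) -> None:
        """Write the result and release one lock on rd."""
        if self.pending[rd] == 0:
            raise RuntimeError("write-back to an unlocked register")
        self.values[rd] = value
        self.pending[rd] -= 1
```

A dependent instruction would retry `try_read` until it returns a value, which corresponds to stalling until the write-back has taken place.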
Example: AMULET1
Example: Asynchronous DLX
Conclusion

Asynchronous design is competitive with the best synchronous designs in power efficiency, and is close in performance and silicon area

References



SCALP: A Superscalar Asynchronous Low-Power Processor, Philip Brian Endecott, Ph.D. thesis, University of Manchester, UK, 1996
AMULET1: An Asynchronous ARM Microprocessor, J. V. Woods et al., IEEE Trans. on Computers, Vol. 46, No. 4, pp. 385-398, Apr 1997
Automating the Design of an Asynchronous DLX Microprocessor, Manish Amde et al., DAC’03, pp. 502-507, June 2003