Transcript slides
Asynchronous Processor Design
for ELEC 6200, by Wei Jiang
ELEC 6200, Fall 07, Oct 24 Jiang: Async. Processor

Why Asynchronous Design
  Higher performance
  Better power efficiency
  No global clock: data is processed at a rate appropriate to the environment
  Local delays are not propagated globally
  Only active functional units consume power
  Inactive parts remain in a “stand-by” state
  No power is wasted on glitches
  Smaller chip size
  Fewer high-frequency EMI components, due to small-amplitude, wide current peaks

Sync vs. Async Pipeline
  Asynchronous pipeline:
    No global clock
    No delay for clock transition

Dynamic Voltage Control
  The power supply is controlled by measuring the occupancy of the input FIFO.
  As the FIFO fills, the supply voltage increases and the processor operates faster to empty the FIFO.

Asynchronous Circuit Design
  Level-sensitive / four-phase signaling protocol
  Transition-sensitive / two-phase protocol:
    simpler design
    no “non-active” state
    fewer transitions
    higher performance
    lower power dissipation

Asynchronous Interface
  Two-phase pipeline handshake protocol
  The Sender must ensure that all bits of the data bundle are valid before sending the Request event (local delay management).
  The Receiver must accept the data bundle before sending its Acknowledge, which permits the Sender to remove the old data and place a new value on the data lines.

Event Driven Pipeline
  Components must respond to input transitions (events) rather than logic levels.
  Muller C-gate: outputs a transition only when all inputs have experienced transitions that change their input levels.
  Storage elements must respond identically to rising and falling transitions.

Asynchronous Storage
  Capture-Pass latch
    An event on the Capture line causes the latch to hold the input data.
    A Capture-done event indicates completion of the capture operation.
    An event on the Pass line causes the latch to return to the transparent state.
    A Pass-done event indicates completion of the pass operation.
  Capture-Pass latches are interconnected with Muller C-gates to form a pipeline.

Prevention of Hazards
  Data hazards
    An instruction depends on the result of a previous instruction that is still in the pipeline.
    Prevented by the register locking mechanism: the instruction is stalled until the write-back of the register takes place.
    The pipeline must support multiple asynchronous read/write operations.
  Control hazards
    The instructions after a branch instruction are loaded into the pipeline but may not be executed.
    Prevented by the delayed branch mechanism: the instruction following the branch is always executed.

Example: AMULET1

Example: Asynchronous DLX

Conclusion
  Asynchronous design is competitive with the best synchronous design in power efficiency, and is close in performance and silicon area.

References
  SCALP: A Superscalar Asynchronous Low-Power Processor, Philip Brian Endecott, Ph.D. thesis, University of Manchester, UK, 1996
  AMULET1: An Asynchronous ARM Microprocessor, J. V. Woods et al., IEEE Trans. on Computers, Vol. 46, No. 4, pp. 385-398, Apr 1997
  Automating the Design of an Asynchronous DLX Microprocessor, Manish Amde et al., DAC’03, pp. 502-507, June 2003
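
The two-phase handshake described on the Asynchronous Interface slide can be sketched in Python as a rough behavioural model. This is an illustration under stated assumptions, not the circuit from the slides: the class name TwoPhaseChannel, its method names, and the transition counter are all hypothetical and introduced here only to show the ordering rules (data valid before Request, data accepted before Acknowledge).

```python
class TwoPhaseChannel:
    """Behavioural model of a two-phase (transition-signalled) bundled-data channel.

    Hypothetical names throughout; a transfer is pending whenever the
    req and ack wires hold different levels.
    """

    def __init__(self):
        self.req = 0          # Request wire level
        self.ack = 0          # Acknowledge wire level
        self.data = None      # the data bundle
        self.transitions = 0  # total events on req and ack

    def sender_ready(self):
        # The Sender may change the data only after the last Request
        # has been acknowledged, i.e. when both wires agree.
        return self.req == self.ack

    def send(self, value):
        if not self.sender_ready():
            raise RuntimeError("previous data not yet acknowledged")
        self.data = value     # 1. make the whole bundle valid
        self.req ^= 1         # 2. only then issue the Request event
        self.transitions += 1

    def receive(self):
        if self.req == self.ack:
            raise RuntimeError("no pending request")
        value = self.data     # 1. accept the bundle
        self.ack ^= 1         # 2. only then issue the Acknowledge event
        self.transitions += 1
        return value


channel = TwoPhaseChannel()
received = []
for word in [10, 20, 30]:
    channel.send(word)
    received.append(channel.receive())
```

Each transfer in this model costs one Request event and one Acknowledge event, with no return-to-zero phase, which is the "fewer transitions" property the slides attribute to the two-phase protocol.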
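
The Muller C-gate behaviour described on the Event Driven Pipeline slide can be sketched as follows. This is a minimal two-input software model, assuming an initial output of 0; the names MullerC and run_c_gate are hypothetical and not taken from the slides.

```python
class MullerC:
    """Two-input Muller C-gate: the output changes only when both inputs agree."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        # When the inputs agree, the output takes their common level.
        # When they differ, the gate holds its previous output.
        if a == b:
            self.out = a
        return self.out


def run_c_gate(steps):
    """Apply a sequence of (a, b) input levels and record the output after each."""
    gate = MullerC()
    return [gate.update(a, b) for a, b in steps]


# Input a rises first, then b; later b falls first, then a.
steps = [(1, 0), (1, 1), (1, 0), (0, 0)]
outputs = run_c_gate(steps)  # [0, 1, 1, 0]
```

The output makes a transition only at the second and fourth steps, that is, only once both inputs have made their transition. This is why the gate can be used to join two events.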