Michael P. Frank http://www.eng.fsu.edu/~mpf Requirements for Practical Reversible Computing Michael P. Frank Solid State Seminar, Notre Dame Tuesday, April 19, 2005 Host: Craig Lent.


Michael P. Frank
http://www.eng.fsu.edu/~mpf
Requirements for Practical
Reversible Computing
Michael P. Frank
Solid State Seminar, Notre Dame
Tuesday, April 19, 2005
Host: Craig Lent
Abstract of Talk
• I’ll survey requirements for energy-efficient
computing beyond the limits of traditional
(“irreversible”) computing technologies.
– We’ll discuss requirements on devices, logic, and on
mechanisms for driving & synchronizing the logic.
• Outline of talk:
– Brief introduction
– Some important device-level figures of merit:
• Energy & entropy coefficients, device cost, speed
– I’ll also discuss limits on some of these
– Logic-level requirements for reversibility:
• Not as stringent as traditionally depicted!
– I’ll show several ways to generalize the requirements.
– Power/clock mechanisms:
• Requirements and major challenges
• A call to Action!
11/7/2015
M. Frank, "Requirements for Practical Reversible Computing"
2
Introduction
• The Importance of Energy
Efficiency
• Limits to Energy Efficiency in
Conventional Computing
• Reversible Computing to the
Rescue!
What is Efficiency?
• The efficiency η of a process that consumes valued
resource R and produces valued product P is the
ratio between the amount of product produced, and
the amount of resource consumed: η = Pprod/Rcons.
– Example 1: A heat engine “consumes” (which in this case,
means “degrades”) an amount Q of high-temperature heat,
and produces an amount W of work.
• The heat engine’s efficiency is thus ηh.e. = W/Q. (Dimensionless.)
– Carnot showed that ηh.e. ≤ (TH − TL)/TH.
– Example 2: A computer consumes an amount Econs of free
energy, and performs Nops useful computational operations
(produces Nops operations worth of computational “effort”).
• The computer’s (energy) efficiency is thus ηE,comp = Nops/Econs.
– Units: Operations per unit energy, or ops/sec/watt.
Energy Efficiency Limits
Cost Efficiency!
• Of course, there are other economically valuable resources
besides energy that are consumed in computing…
– Manufacturing/operating costs, opportunity costs, etc.
• But, the total cost ¢ of a process obviously can never be less
than the cost ¢E of the energy used!
– Thus, cost-efficiency FC = Nops/¢ is limited to be at most Nops/¢E,
• or, at best proportional to the energy efficiency ηE = Nops/E.
• Therefore, greatly improving cost-efficiency requires improving energy
efficiency, when energy-related costs are significant!
– The direct and indirect costs of energy have always been non-negligible contributors to total operating costs in computing.
• The many orders-of-magnitude improvement in computer cost-efficiency
over the last 50 years has only been possible because of energy efficiency
improvements of comparable magnitude!
Lower Bounds on
Energy Dissipation
• In today’s 90 nm VLSI technology, for minimal operations
(e.g., conventional switching of a minimum-sized transistor):
– Ediss,op is on the order of 1 fJ (femtojoule) ⇒ ηE ≲ 10¹⁵ ops/sec/watt.
• Will be a bit better in coming technologies (65 nm, maybe 45 nm)
• Conventional digital technologies are subject to several lower
bounds on their energy dissipation Ediss,op for digital logic /
storage / communication operations,
– And thus, corresponding upper bounds on their energy efficiency.
• Some of the known bounds include:
– Leakage-based limit for high-performance field-effect transistors:
• Perhaps roughly ~5 aJ (attojoules) ⇒ ηE ≲ 2×10¹⁷ operations/sec/watt
– Reliability-based limit for all non-energy-recovering technologies:
• Roughly 1 eV (electron-volt) ⇒ ηE ≲ 6×10¹⁸ operations/sec/watt
– von Neumann-Landauer (VNL) bound for all irreversible technologies:
• Exactly kT ln 2 ≈ 18 meV ⇒ ηE ≲ 3.5×10²⁰ operations/sec/watt
– For systems whose waste heat ultimately winds up in Earth’s atmosphere,
» i.e., at temperature T ≈ Troom = 300 K.
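The bounds listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming standard SI values for the physical constants and T = 300 K; the dictionary labels and the helper name `ops_per_joule` are illustrative only. Note that ops/sec/watt is the same unit as operations per joule.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
EV = 1.602176634e-19   # one electron-volt, in joules
T_ROOM = 300.0         # assumed environment temperature, K

def ops_per_joule(e_diss_joules):
    """Upper bound on energy efficiency if each operation dissipates e_diss."""
    return 1.0 / e_diss_joules

# Dissipation per operation for each limit quoted on the slide, in joules.
bounds = {
    "90 nm CMOS (~1 fJ)": 1e-15,
    "FET leakage limit (~5 aJ)": 5e-18,
    "reliability limit (1 eV)": EV,
    "VNL bound (kT ln 2 at 300 K)": K_B * T_ROOM * math.log(2),
}

for name, e_diss in bounds.items():
    print(f"{name}: {e_diss:.3g} J/op -> {ops_per_joule(e_diss):.2g} ops/J")
```

The last entry evaluates to about 18 meV per operation, matching the figure quoted for the VNL bound.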
Gate Energy Trends
[Figure: trend of minimum transistor switching energy, based on the ITRS ’97-’03 roadmaps. Vertical axis: CV²/2 energy in J, log scale from 1.E-14 down to 1.E-22 (fJ, aJ, and zJ marked); horizontal axis: year, 1995 to 2045. Series: LP and HP minimum gate energy, labeled by node number (nm DRAM half-pitch): 250, 180, 130, 90, 65, 45, 32, 22. Reference levels at room temperature (300 K): the 100 kT reliability limit, one electron volt, the kT thermal energy, and the von Neumann-Landauer limit kT ln 2. Annotation: “Practical limit for CMOS?”]
Reliability Bound on Logic
Signal Energies
• Let Esig denote the logic signal energy,
– The energy involved in storing, transmitting, or transforming a bit’s worth of
digital information.
• But note that “involved” does not necessarily mean “dissipated!”
• As a result of fundamental thermodynamic considerations, it is required
that Esig ≥ kBTsig ln R,
– Where kB is Boltzmann’s constant, 1.38×10⁻²³ J/K;
– and Tsig is the temperature of the local subsystem carrying the signal;
– and R is the reliability factor, i.e., the improbability 1/perr of error.
• In non-energy-recovering logic technologies (totally dominant today)
– Basically all of the signal energy is dissipated to heat on each operation.
• And often additional energy (e.g., short-circuit power) as well.
• In this case, minimum sustainable dissipation is Ediss,op ≳ kBTenv ln R,
– Where Tenv is now the temperature of the waste-heat reservoir
• Averages around 300 K (room temperature) in Earth’s atmosphere
• For a decent R = 2×10¹⁷, this energy is ~40 kT ≈ 1 eV.
– Therefore, for energy efficiency > 1 op/eV, we must recover some of the signal energy.
• Rather than dissipating it all to heat with each manipulation of the signal.
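As a quick check of the reliability bound, the sketch below evaluates kT ln R for the reliability factor quoted on the slide. It assumes standard SI constants and T = 300 K; the function name `min_signal_energy` is illustrative only.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
EV = 1.602176634e-19   # one electron-volt, in joules

def min_signal_energy(reliability, temperature_k):
    """Lower bound k*T*ln(R) on the signal energy, in joules."""
    return K_B * temperature_k * math.log(reliability)

T_ENV = 300.0                      # assumed environment temperature, K
e_min = min_signal_energy(2e17, T_ENV)
in_units_of_kt = e_min / (K_B * T_ENV)
in_ev = e_min / EV

print(f"E >= {in_units_of_kt:.1f} kT = {in_ev:.2f} eV")
```

The result is a little under 40 kT, or slightly more than 1 eV, consistent with the rounded values above.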
Von Neumann-Landauer Bound
• Follows directly from the time-reversibility (invertibility) of all
fundamental physical dynamics.
– This in turn is implied by the Hamiltonian formulation of mechanics;
and the unitarity of quantum mechanics.  Very well-established.
• Implies that physical information can never be destroyed!
– Only reversibly (mathematically invertibly) transformed!
• When we lose or discard a bit’s worth of logical information,
– e.g., by erasing or destructively overwriting a bit storage location…
• the ‘lost’ information must actually remain in existence,
– if in no other form, then as a bit’s worth (k ln 2) of physical entropy.
• Entropy simply means unknown information in the physical state.
• If the logical bit was originally known (not entropy)
– then entropy has increased in this process by ∆S = 1 bit = k ln 2.
• The energy in the heat reservoir must be increased by an amount ∆S·Tenv
= kTenv ln 2 in order to contain this additional entropy.
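The entropy bookkeeping can be illustrated by counting microstates. This is a minimal sketch, assuming 8 physical microstates per logical state (an arbitrary small number chosen for illustration) and T = 300 K: erasing a known bit merges two logical states into one, so the physical state must spread over twice as many microstates.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
T_ENV = 300.0        # assumed environment temperature, K

def entropy_bits(n_microstates):
    """Entropy of a uniform distribution over n microstates, in bits."""
    return math.log2(n_microstates)

microstates_per_logical_state = 8   # illustrative value
s_before = entropy_bits(microstates_per_logical_state)      # bit value known
s_after = entropy_bits(2 * microstates_per_logical_state)   # two states merged

delta_s_bits = s_after - s_before
e_diss = delta_s_bits * K_B * math.log(2) * T_ENV   # 1 bit = k ln 2

print(f"dS = {delta_s_bits} bit, E_diss = {e_diss:.3g} J")
```

The increase is exactly one bit regardless of how many microstates are assumed per logical state, since only the ratio enters.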
VNL Bound on Energy Dissipation
from Information Loss
[Figure: physical microstate trajectories during a bit erasure, with N physical microstates per logical macrostate before the erasure (shown as 8 for clarity in this simple example). Before the operation, logical states “0” and “1” each have S = k ln 8 = 3 bits; after the operation, logical state “0” has S = k ln 16 = 4 bits. Thus ∆S = 1 bit = k ln 2, and Ediss = ∆S·Tenv = kTenv ln 2.]
Follows directly from the reversibility
of fundamental physics!
Reversible Computing
• The basic idea is simply this:
– Don’t erase information when performing logic / storage /
communication operations!
• Instead, just reversibly transform it in place!
• When reversible digital operations are implemented
using well-designed energy-recovering circuitry,
– This can result in local energy dissipation Ediss << Esig,
• has been empirically demonstrated by many groups.
– and even (in principle) energy dissipation Ediss << kT ln 2!
• This has been shown in theory, but we are not yet to the point of
demonstrating such low levels of dissipation experimentally.
– Achieving this goal requires very careful design,
– and verifying it requires very sensitive measurement equipment.
Device-Level Requirements for
Reversible Computing
• A good reversible device technology should have:
– Low manufacturing cost ¢d per device
• Important for good overall cost-efficiency
– Low rate of static power dissipation Pleak due to energy leakage.
• Required for energy-efficient storage
– Low energy coefficient cE = Ediss/f (energy dissipated per operation per unit
transition frequency) for adiabatic transitions.
• Implies we can achieve a high operating frequency (and thus good cost-performance) at a given level of energy efficiency.
– High maximum transition frequency fmax.
• Important for those applications in which latency of serial computations dominates
total cost
• Important: For system-level energy efficiency, Pmin and cE must be taken
as effective global values measuring the implied amount of energy emitted
into the outside environment at temperature Tenv.
– With an ideal (Carnot) refrigerator, Pmin = St·Tenv and cE = cS·Tenv,
• Where St = the static rate of leakage entropy generation per unit time,
• and cS = Sgen/f is the adiabatic entropy coefficient, or entropy generated per unit transition
frequency.
Energy & Entropy Coefficients
in Electronics
• For a transition involving the adiabatic transfer of an
amount Q of charge along a path with resistance R:
– The raw (local) energy coefficient is given by
cE = Ediss·t = Pdiss·t² = I·V·t² = I²R·t² = Q²R.
• Where V is the voltage drop along the path, and t = 1/f is the transition time.
– The entropy coefficient cS = Q²R/Tpath.
• where Tpath is the local thermodynamic temperature in the path.
– Effective (global) energy coefficient cE,eff = Q²R(Tenv/Tpath).
• The cE of a simple adiabatic circuit in a recent 180
nm technology (measured in a Cadence simulation)
was ~80 eV/GHz.
– Corresponds to a Q per charged-up transistor gate of on
the order of 6,000 electrons.
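To make the relation between the energy coefficient, the charge, and the path resistance concrete, here is a small sketch using the two figures quoted above (~80 eV/GHz and ~6,000 electrons). The effective resistance it prints is only what those two numbers imply together under cE = Q²R; it is not a value stated in the talk, and the function names are illustrative.

```python
EV = 1.602176634e-19    # one electron-volt, in joules
Q_E = 1.602176634e-19   # elementary charge, in coulombs

def energy_coefficient(charge_c, resistance_ohm):
    """c_E = Q^2 * R, in joule-seconds (energy per unit frequency)."""
    return charge_c ** 2 * resistance_ohm

def implied_resistance(c_e, charge_c):
    """Invert c_E = Q^2 * R for the effective path resistance."""
    return c_e / charge_c ** 2

c_e = 80 * EV / 1e9     # 80 eV/GHz expressed in J/Hz
charge = 6000 * Q_E     # ~6,000 electrons
r_eff = implied_resistance(c_e, charge)

# Adiabatic dissipation per transition scales linearly with frequency.
e_diss_100mhz_ev = c_e * 100e6 / EV

print(f"implied R ~ {r_eff / 1e3:.1f} kOhm")
print(f"E_diss at 100 MHz ~ {e_diss_100mhz_ev:.1f} eV")
```

At a tenth of the frequency, the dissipation per transition drops by the same factor of ten, which is the defining property of an adiabatic transition.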
Limiting Cases of
Energy/Entropy Coefficients
• Energy/entropy coefficients in adiabatic “single electronics:”
– Suppose the amount of charge moved |Q| = q (a single electron)
– Let the path consist of a single quantum channel (chain of states)
• Has quantum resistance R = R0 = 1/G0 = h/2q² = 12.9 kΩ.
– Then cE = q²R0 = h/2 = 2.07 meV/THz (very low!)
• If path is at Tpath = Troom = 300 K, then cS = 0.08 k/THz.
– For N× better efficiency than this, let the path consist of N parallel
quantum channels.  N× lower resistance.
• What about systems where resistive models may not apply?
– E.g., superconductors, photonics, etc.
• A more general and rigorous (but perhaps loose) lower bound
on the energy coefficient in all adiabatic quantum systems is
given by the expression cE ≥ h²/(4·Eg·t),
– where Eg = energy gap between ground & excited states,
– and t = time taken for a single orthogonalizing transition
– Ex.: Let Eg = 1 eV, t = 1 ps. Then cE ≥ 4.28 μeV/THz.
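The numbers on this slide can be reproduced directly from the physical constants. The sketch below assumes standard SI values and T = 300 K; the variable names and the unit-conversion helper are illustrative only.

```python
H = 6.62607015e-34      # Planck constant, J*s
Q_E = 1.602176634e-19   # elementary charge, C (also joules per eV)
K_B = 1.380649e-23      # Boltzmann constant, J/K

def j_per_hz_to_ev_per_thz(x):
    """Convert an energy coefficient from J/Hz to eV/THz."""
    return x / Q_E * 1e12

# Single electron through a single quantum channel.
r0 = H / (2 * Q_E ** 2)               # quantum resistance, ohms
c_e = Q_E ** 2 * r0                   # equals h/2, in J/Hz
c_e_mev_per_thz = j_per_hz_to_ev_per_thz(c_e) * 1e3
c_s_k_per_thz = c_e / 300.0 / K_B * 1e12   # entropy coefficient at 300 K

# General adiabatic bound h^2 / (4 * Eg * t) for Eg = 1 eV, t = 1 ps.
bound = H ** 2 / (4 * (1.0 * Q_E) * 1e-12)
bound_uev_per_thz = j_per_hz_to_ev_per_thz(bound) * 1e6

print(f"R0 = {r0:.0f} ohm, cE = {c_e_mev_per_thz:.2f} meV/THz, "
      f"cS = {c_s_k_per_thz:.3f} k/THz, bound = {bound_uev_per_thz:.2f} ueV/THz")
```

With N parallel channels, `r0` and therefore `c_e` would simply be divided by N.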
Logic-Level Requirements for
Reversible Computing
• A traditional logical “requirement” for
thermodynamically reversible logic:
– All local n-bit operations must carry out a 1-to-1 (bijective)
transformation on the space of all 2ⁿ possible inputs.
• Strictly speaking, this is false!
– It is actually quite a bit more restrictive than necessary.
• Avoiding Landauer’s principle only requires:
– The number of states in the possible set (consistent with
our design knowledge) must not decrease.
• But many-to-many, not just 1-to-1 transitions may be used.
– Further, this is only required to be true on average.
• E.g., it is OK to erase previously nondeterministically obtained bits!
– Finally, it is only required to be true on states encountered,
• Not necessarily on the space of all 2ⁿ describable inputs!
Non-Injective Operations Can Be
Thermodynamically Reversible
• For example, consider
the “operation”
illustrated at right.
– 3 initial states
• all equally likely
– 3 final states
(In the figure, circles contain state probabilities.)
• Transition relation is not an injective function,
– but a many-to-many relation (may have weighted arcs)
• As long as the transition probabilities have semidetailed balance (∀a: ∑b p(b→a) = 1),
initially uniform distributions will stay that way!
– No increase in entropy, if initial state is unknown.
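The claim can be checked numerically on a small example. The sketch below uses a hypothetical 3-state transition matrix (not the one from the slide's figure) in which every state has two equally weighted outgoing arcs, so the operation is neither deterministic nor injective, yet the incoming probabilities of every final state sum to 1.

```python
import math

def step(dist, p):
    """Apply one transition; p[b][a] is the probability of going from b to a."""
    n = len(dist)
    return [sum(dist[b] * p[b][a] for b in range(n)) for a in range(n)]

def entropy_bits(dist):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(x * math.log2(x) for x in dist if x > 0)

# Hypothetical many-to-many relation on 3 states.
p = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
]

# Rows sum to 1 (normalization); columns sum to 1 (semidetailed balance).
column_sums = [sum(p[b][a] for b in range(3)) for a in range(3)]

uniform = [1 / 3] * 3
after = step(uniform, p)

print(column_sums, after, entropy_bits(uniform), entropy_bits(after))
```

If a column sum were changed away from 1 (for instance by routing both arcs of state 0 into state 1), the final distribution would no longer be uniform and its entropy would drop, which is the case Landauer's principle penalizes.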
Reversible Computations Can Even
Contain Many-to-One Operations
• As long as operations are still N-to-N on average!
[Figure: state diagram of the pictured computation; circles contain state probabilities (1, then 1/2 and 1/2, then 1).]
• E.g., in the pictured computation, we first
nondeterministically randomize a known bit
– Extracting 1 bit of entropy from the environment
• then later, we erase this bit,
– Returning the bit of entropy to the environment.
• Total entropy need not increase in such a process!
We are even free to permanently
compress parts of the state space…
• As long as the subset of states that actually
arise is not compressed!
• E.g., at right, the operation takes
the top two initial states to the
same final state…
– But we design the system in such
a way that those two states never arise!
• Note the state that can arise has
a unique successor…
– More generally, its “equivalent set” (set of
equivalent states) must not be compressed.
[Figure: state-transition diagram; circles contain state probabilities (0, 0, and 1 for the initial states; 0 and 1 for the final states).]
Pop Quiz: Can This Machine Be
Thermodynamically Reversible?
• Suppose the transition
relation between digital
states is as shown.
– Outgoing arcs are chosen
with equal weight.
• Subset A of initial states
is guaranteed, by
design, never to arise.
– States in subsets B and C
may arise, but the
particular state within a
given subset is completely
random.
[Figure: state-transition diagram with the initial states grouped into subsets A, B, and C.]
Answer: Yes, in fact, running this
machine can (temporarily) decrease
the entropy of the environment!
Why is all of this Useful?
• The fact that only N-to-N (not 1-to-1) ops are required is
useful because:
– We can encode known information using “equivalent sets” of lower-level states whose transitions are treated as noninjective (and
nondeterministic).
• That is, we don’t have to track the complete microstate.
• The fact that transitions only need to be N-to-N on average is
useful because:
– It allows us to execute randomized algorithms,
• and dispose of the random numbers later, when we no longer need them.
• The fact that only the possible subset needs to be reversible
is useful because:
– It allows us to build fully-reversible machines out of easily-implemented
logic devices that are only conditionally reversible.
• That is, that are reversible only if certain design rules are followed.
– The resulting designs can be much simpler!
• Compared to building everything from Toffoli and Fredkin gates.
Other Misconceptions To Avoid
in Reversible Logic Designs
• Be aware that quantum and reversible “logic networks” (time-sequences of operations) are not the same thing as hardware
diagrams!
– It’s generally a bad idea to try to use one directly as the other.
• Please always take care to distinguish between logic
operations and logic gates.
– Operations are transformations of part of the logical state,
• and their “inputs” and “outputs” are really just the “before” and “after”
configurations of the local state.
– Gates are physical devices (hardware) that can implement one or more
operations on their set of impinging wires (I/O signals).
• For hardware, an “input” means a wire that affects the gate’s behavior,
• and an “output” means a wire that the gate’s behavior affects.
• A gate may use some signal wires as both inputs and outputs!
– E.g., a reversible operation depicted as having 3 inputs and 3 outputs
can be implemented by a physical gate that is attached to a total of
only 3 wires.
What’s the Simplest Universal
Reversible Logic Gate?
• Where “simplest” here refers to number of
data signals operated on…
– Guess what: It isn’t the Fredkin or Toffoli gate…
• And it isn’t any of the fully-reversible gates!
• Rather it’s a conditionally reversible gate…
– I call it the reversible buffer, or crSET/crCLR gate:
• It involves only 2 data signals:
– 1 input, and 1 output (can be tristated)
• Some implementations use only 2 CMOS transistors!
– Together with latches, we can efficiently build
arbitrary reversible logic with it.
Reversible Buffers and Latches
• A universal set of conditionally reversible operations:
– crSET(a,b): Controlled Reversible SET.
• Semantics: (ab = 0) if a then b := 1, else if b then unlock(a), else lock(a)
– If a is 1, then set b to 1 (else leave b alone).
» Reversible on condition that a and b are not both 1 (& locks are obeyed).
– crCLR(a,b): Controlled Reversible CLR.
• A.k.a. crUnSET – It’s crSET in reverse.
• Semantics: (a·¬b = 0) if a then b := 0, else if b then lock(a), else unlock(a)
– If a is 1, then set b to 0.
» Reversible if we don’t have a = 1, b = 0 (& locks are obeyed)
– rLatch(a,b): Reversible latch operation.
• Semantics: a =/= b
– Meaning, break the connection a from b through this particular latch HW.
– rUnLatch(a,b): Reversible “unlatch” operation.
• Semantics: (a = b) a == b
– Meaning, connect a to b through a particular bit of latching HW.
» Reversible on condition that a = b initially.
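The conditional reversibility of crSET and crCLR can be demonstrated with a toy model. This is a minimal sketch that models only the data semantics ("if a then b := 1" and "if a then b := 0") together with the stated preconditions; the lock/unlock behavior and the latch operations are not modeled, and the function names are illustrative.

```python
from itertools import product

def cr_set(a, b):
    """crSET: if a then b := 1. Disallowed when a and b are both 1."""
    if a and b:
        raise ValueError("crSET precondition violated: a = 1 and b = 1")
    return (a, 1) if a else (a, b)

def cr_clr(a, b):
    """crCLR: if a then b := 0. Disallowed when a = 1 and b = 0."""
    if a and not b:
        raise ValueError("crCLR precondition violated: a = 1 and b = 0")
    return (a, 0) if a else (a, b)

def allowed_inputs(op):
    """All (a, b) pairs on which the operation's precondition holds."""
    ok = []
    for state in product((0, 1), repeat=2):
        try:
            op(*state)
        except ValueError:
            continue
        ok.append(state)
    return ok

set_inputs = allowed_inputs(cr_set)
set_outputs = [cr_set(*s) for s in set_inputs]
round_trips = [cr_clr(*cr_set(*s)) for s in set_inputs]

print(set_inputs, set_outputs, round_trips)
```

On the three allowed inputs the mapping is one-to-one, and applying crCLR afterwards restores the original state. Without the precondition, (1, 0) and (1, 1) would both map to (1, 1), which is the information-losing case the design rules are meant to exclude.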
CMOS Gate Implementing
crSET & crCLR
• Reversible Buffer (does crSET & crCLR)
[Figure: implementation, icon, and spacetime diagram of the reversible buffer. The implementation is a CMOS transmission gate between “drive” and “out”, controlled by the input (labels in, inN, inP, inNP); the spacetime diagram shows crSET and crCLR over time.]
• Double the hardware to get a dual-rail output
• Can show timing control signal “drive” on icon
• Special notation in spacetime diagram is used
to keep track of constraints on nodes.
CMOS Gate Implementing
rLatch/rUnLatch
• Symmetric Reversible Latch
[Figure: implementation, icon, and spacetime diagram of the symmetric reversible latch, with signals “in”, “mem”, and “connect”; the spacetime diagram shows crLatch and crUnLatch over time.]
• Just a transmission gate again
• This time controlled by a clock, with the data signal driving
• Concise, symmetric hardware icon – Just a short orthogonal line
• Thin strapping lines denote connection in spacetime diagram.
Example:
Building cNOT from rlXOR
• rlXOR(a,b,c): Reversible latched XOR.
– Semantics: (c = N) c := a⊕b.
• Given that c is initially in a predefined “neutral” or “no
information” state N, set c to the value (a XOR b).
– Easy to implement with transistors (or in QCA)
• cNOT(a,b): Controlled-NOT operation.
– Semantics: b := ab. (No preconditions.)
• A popular “primitive” in reversible & quantum comp.
• Complex to implement in hardware
– Not a very good building block for practical hardware!
– But we can build it, if we really want to.
cNOT from rlXOR:
Hardware Diagram
• A logic block implementing an in-place cNOT
operation (a cNOT “gate”) can be constructed
from 2 rlXOR gates and two latched buffers.
[Figure: hardware diagram with signals A, B, and X, and reversible latches.]
• The key is:
– Operate some of the gates in reverse!
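One way to see how running gates in reverse yields an in-place cNOT is to simulate the node states. The sketch below is an assumed operation sequence, not a transcription of the slide's hardware diagram: it uses two rlXOR operations (one forward, one in reverse) and two buffer operations (one forward, one in reverse) on nodes named A, B, and X, with `None` standing for the neutral state N. All function names are illustrative.

```python
from itertools import product

N = None  # neutral "no information" state of a node

def rl_xor(s, a, b, c):
    """rlXOR forward: requires c = N, then sets c := a XOR b."""
    assert s[c] is N, "rlXOR precondition: target must be neutral"
    s[c] = s[a] ^ s[b]

def rl_xor_reverse(s, a, b, c):
    """rlXOR in reverse: requires c = a XOR b, then returns c to N."""
    assert s[c] == s[a] ^ s[b], "reverse rlXOR precondition violated"
    s[c] = N

def buffer_copy(s, src, dst):
    """Buffer forward: requires dst = N, then sets dst := src."""
    assert s[dst] is N, "buffer precondition: target must be neutral"
    s[dst] = s[src]

def buffer_uncopy(s, src, dst):
    """Buffer in reverse: requires dst = src, then returns dst to N."""
    assert s[dst] == s[src], "reverse buffer precondition violated"
    s[dst] = N

def cnot(a, b):
    s = {"A": a, "B": b, "X": N}
    rl_xor(s, "A", "B", "X")           # X := A xor B
    rl_xor_reverse(s, "A", "X", "B")   # B = A xor X holds, so B returns to N
    buffer_copy(s, "X", "B")           # B := X
    buffer_uncopy(s, "B", "X")         # X = B holds, so X returns to N
    return s

for a, b in product((0, 1), repeat=2):
    print((a, b), "->", cnot(a, b))
```

Every step satisfies its precondition for all four inputs, so no information is discarded along the way, and the scratch node X ends in the neutral state.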
Simulation Results from Cadence
Power vs. freq., TSMC 0.18, Std. CMOS vs. 2LAL
[Figure: log-log plot of average power dissipation per nFET (W, 1.E-05 down to 1.E-14) and energy dissipated per nFET per cycle, versus frequency (Hz, 1.E+09 down to 1.E+03), comparing standard CMOS with 2LAL. Annotations: “<.01× the power @ 1 MHz” and “>100× faster @ 1 pW/T”.]
Assumptions & caveats:
• Assumes ideal trapezoidal power/clock waveform.
• Minimum-sized devices, 2λ×3λ
– .18 µm (L) × .24 µm (W)
• nFET data is shown
– pFET data is very similar
• Various body biases tried
– Higher Vth suppresses leakage
• Room temperature operation.
• Interconnect parasitics have not yet been included.
• Activity factor (transitions per device-cycle) is 1 for CMOS, 0.5 for 2LAL in this graph.
• Hardware overhead from fully-adiabatic design style is not yet reflected
– ≥2× transistor-tick hardware overhead in known reversible CMOS design styles
O(log n)-time carry-skip adder
(8-bit segment shown)
With this structure, we can do a
2ⁿ-bit add in 2(n+1) logic levels
→ 4(n+1) reversible ticks
→ n+1 clock cycles.
Hardware overhead is <2× regular ripple-carry.
[Figure: schematic of the 8-bit carry-skip adder segment, built from adder cells with signals S, A, B, G, P, Cin, and Cout, whose most-significant (MS) and least-significant (LS) halves are combined over the 2nd, 3rd, and 4th carry ticks.]
32-bit Adder Simulation Results
[Figure, two panels: “32-bit adder power vs. frequency”, with Power (W, 1.E-04 down to 1.E-10) against Add Frequency (Hz, 1.E+08 down to 1.E+04), and “32-bit adder energy vs. frequency”, with Energy/Add (J, 1.E-11 down to 1.E-15) against Add Frequency (Hz). Series labels: 1V CMOS, 0.5V CMOS, CMOS energy, Adia. energy, CMOS pwr, Adia. pwr. Annotation: “20x better perf. @ 3 nW/adder”.]
(All results normalized to a
throughput level of 1 add/cycle)
Plenty of Room for
Device Improvement
• Recall, irreversible device
technology has at most ~3-4 orders of magnitude of
power-performance
improvements remaining.
– And then, the firm kT ln 2 limit
is encountered.
• But, a wide variety of
proposed reversible device
technologies have been
analyzed by physicists.
– With theoretical power-performance up to 10-12
orders of magnitude better
than today’s CMOS!
• Ultimate limits are unclear.
[Figure: power vs. frequency for alternative device technologies. Power per device (W, 1.E-03 down to 1.E-31) against frequency (Hz, 1.E+12 down to 1.E+03). Series: .18um 2LAL, nSQUID, QCA cell, Quantum FET, Rod logic, Param. quantron, Helical logic, .18um CMOS, and kT ln 2. Annotation: “Various reversible device proposals”.]
Requirements for Energy-Recovering Clock/Power Supplies
• All known reversible computing schemes require a periodic
global signal that synchronizes and drives adiabatic
transitions.
– For good system-level energy efficiency, this signal must oscillate
resonantly and near-ballistically, with a high effective quality factor.
• Several factors make the design of a satisfactory resonator
quite difficult:
– Need to avoid uncompensated back-action of logic on resonator
– In some resonators, Q factor may scale unfavorably with size
– Effective quality factor problem
• I’m not saying it’s impossible…
– But it’s definitely a nontrivial hurdle, that we need to face up to, pretty
urgently…
• If we want to convince people that reversible computing will work.
The Back-Action Problem
• The ideal resonator signal is a pure periodic signal.
– A pretty general result from communications theory:
• A resonator’s quality factor is inversely proportional to its signal bandwidth B.
– E.g., for an EM cavity w. resonant frequency ω0,
• the half-maximum BW is B = ∆ω = ω0/(2πQ) [1].
– Thus Q→∞ ⇒ B→0.
• There must be little or no information in the resonator signal!
• However, if the logic load being driven varies from one cycle to the next,
– whether due to data-dependent variations,
– or structural variations (different amounts of logic being driven per cycle)
• this will tend to produce impedance nonuniformities, which will lead to
nonuniform reflections of the resonator signal
– and thereby introduce nonzero bandwidth into that signal.
• Even more generally, any departure of resonator energy away from an
ideal desired trajectory represents a form of effective energy dissipation!
– we must control exactly where (into what states) all the energy goes
• the set of possible microstates of the system must not grow quickly
[1] Schwartz, Principles of Electrodynamics, Dover, 1972.
Unfavorable Scaling of Resonator
Quality Factor with Size?
• I don’t yet have a perfectly clear and general
understanding of this issue, but…
– In a lot of oscillator systems I’ve looked at, the resonator Q
factor may tend to get worse (or at least, not much better)
as the resonator gets smaller.
• In LC oscillators, inductor Q scales inversely to frequency
– EM emission is greater at high frequencies
– But, the tendency is for low f ⇒ large coil sizes
• Anecdotal reports from people working in NEMS community…
– Difficult to get high Q in nanoscale electromechanical resonators
» Perhaps due to difficulty of precision engineering at nanoscale?
• Our own experience working with transmission-line resonators
• Example: In a cubical EM cavity of size L,
– We have 2πQ = L / 8δ, where δ = skin depth. ([1] again)
• Skin depth δ = (2πσk)^(−1/2), where σ = wall conductivity, k = wave #.
– So if L is fixed, high Q ⇒ small δ ⇒ large k ⇒ high f ⇒ low Q in logic!
The Effective Quality Factor
Problem
• Actual quality factor of resonator Q = Eres/Edissr.
– Where Eres = energy contained in resonator signal
– and Edissr = energy dissipated in resonator per cycle.
• But the effective quality factor, for purposes of doing
energy-efficient logic transitions is Qeff = Edeliv/Edissr.
– Where Edeliv = energy delivered to the logic per transition.
• Since 1/Qeff of the logic signal energy is dissipated per cycle.
• Thus, Qeff = Q · (Edeliv/Eres).
– That is, the effective Q is taken down by the fraction of
resonator energy delivered to the logic per cycle.
• If a resonator needs to be large to attain high Q,
– it may also hold a large amount of energy Eres,
• and so it may not have a very high effective Q for driving the logic!
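The relation between the actual and effective quality factors can be illustrated with made-up numbers. In the sketch below, the values of Q, Eres, and Edeliv are hypothetical and chosen only to show the effect; the function names are illustrative.

```python
def resonator_dissipation(e_res, q):
    """Energy dissipated in the resonator per cycle, from Q = E_res / E_dissr."""
    return e_res / q

def effective_q(q, e_deliv, e_res):
    """Q_eff = Q * (E_deliv / E_res)."""
    return q * e_deliv / e_res

# Hypothetical example: a high-Q resonator storing far more energy
# than it delivers to the logic on each transition.
q = 10_000
e_res = 1e-12     # J stored in the resonator signal (assumed)
e_deliv = 1e-14   # J delivered to the logic per transition (assumed)

e_dissr = resonator_dissipation(e_res, q)
q_eff = effective_q(q, e_deliv, e_res)

print(f"E_dissr = {e_dissr:.1e} J/cycle, Q_eff = {q_eff:.0f}")
```

Here only 1% of the stored energy reaches the logic, so a nominal Q of 10,000 yields an effective Q of about 100, the same value obtained directly from Edeliv/Edissr.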
MEMS (& NEMS) Resonators
• State of the art of technology demonstrated in lab:
– Frequencies up to the 100s of MHz, even GHz
– Q’s >10,000 in vacuum, several thousand even in air!
• An important emerging technology being explored
for use in RF filters, etc., in communications
SoCs, e.g. for cellphones.
[Figure: resonator from U. Mich., poly, f = 156 MHz, Q = 9,400; dimension label 34 µm.]
PATENT PENDING
Original Concept
• Imagine a set of charged plates whose horizontal position oscillates
between two sets of interdigitated fixed plates.
– Structure forms a variable capacitor and voltage divider with the load.
• Capacitance changes substantially only when crossing border.
– Produces nearly flat-topped (quasi-trapezoidal) output waveforms.
– The two output signals have opposite phases (2 of the 4 φ’s in 2LAL)
[Figure: circuit in which the moving plates drive logic load #1 and logic load #2 (each shown with RL and CL), with waveforms of the plate position x and the output voltages V1 and V2 versus time t.]
PATENT PENDING
Resonator Schematic
[Figure: resonator schematic showing actuator and sensor electrodes, capacitances Ca, Cs, and Cr, and voltages Vc, Vb, Vp, and vac.]
PATENT PENDING
New Comb Finger Shape IV
Arm anchored to nodal points of fixed-fixed beam flexures,
located a little ways away, in both directions (for symmetry)
Repeat interdigitated structure arbitrarily many times along y axis, all anchored to the same flexure
[Figure: comb finger geometry on x, y, and z axes, showing the moving metal plate, its support arm/electrode, and the moving plate's range of motion between a phase 0° electrode and a phase 180° electrode, with a plot of capacitance C(θ) for θ from 0° to 360° for each electrode.]
PATENT PENDING
Another Candidate Layout
New simulation results
[Figure: two simulation plots; only the axis tick values survive in this transcript.]
DRIE CMOS-MEMS Resonators
[Figure: front-side and back-side views of 150 kHz resonators, with the serpentine spring, proof mass, and comb drive labeled.]
PATENT PENDING
Post-TSMC35 AdiaMEMS Resonator
Taped out April ’04
[Figure: resonator layout with the drive comb, sense comb, and flex arm labeled.]
A Challenge for Our Community
• I predict that the field’s critics will never be silenced
by theory and simulations alone…
– To prove to the world that reversible computing can really
work will require a complete empirical demonstration.
• We also cannot afford to sweep resonator-related
difficulties under the rug…
– A convincing demonstration of low total system power must
be completely self-contained, including the resonator.
• with only DC power input as needed to keep it running
• My challenge to us:
– Let’s work together to fabricate and empirically
demonstrate (for starters) an N-bit binary counter that
measurably dissipates less than some small multiple of kT
energy per cycle in a room-T environment
• “Wall-plug” power, as our critics like to put it.
Why a Binary Counter?
• The number of bits that must flip varies dramatically
from cycle to cycle.
– Usually just 1 or 2, sometimes as many as N.
• The average number is small, however…
– conventional irreversible solutions would need to dissipate only a
small multiple of the bit energy per cycle on average.
– Data-dependent: Depends on initial state of the counter.
• The resonator system cannot “know” what the counter state is
• As a result, the physical action required to carry out
each cycle is non-uniform, and data-dependent.
– Implies that either the energy supplied is non-uniform, or
the time taken per cycle is non-uniform.
• Either one poses challenges for resonator design.
• I believe this goal is already quite difficult,
– but is a good stepping-stone towards a full reversible CPU.
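The cycle-to-cycle variation in switching activity is easy to quantify. The sketch below counts how many bits change on each increment of an N-bit counter over one full period, including the wrap-around; N = 8 is an arbitrary choice for illustration, and the function name is illustrative.

```python
def bits_flipped(value, n_bits):
    """Number of bit positions that change when an n-bit counter increments."""
    mask = (1 << n_bits) - 1
    next_value = (value + 1) & mask
    return bin(value ^ next_value).count("1")

N_BITS = 8   # arbitrary counter width for illustration
flips = [bits_flipped(v, N_BITS) for v in range(2 ** N_BITS)]

average = sum(flips) / len(flips)
print(f"min = {min(flips)}, max = {max(flips)}, average = {average:.3f}")
```

Half of all cycles flip a single bit, the average stays just under 2 for any width, and the worst case flips all N bits, so the load seen by the resonator varies by a factor of N between cycles.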
Conclusion
• Reversible computing is a prerequisite for getting
beyond the next decade or so of improvements in
computer energy efficiency.
– This is rigorously implied by fundamental physics!
• Practical reversible computing requires:
– Devices with very low energy coefficients cE…
• e.g., Notre Dame’s own Quantum-dot Cellular Automata
– Logic design that is somewhat constrained
• though not as much as people used to think!
– Very high quality power/clock resonator systems
• this is, I think, by far the most difficult part to achieve
• Let’s work together to tackle the engineering
challenges and convincingly demonstrate this new
paradigm for 21st-century computing!
The 1st International Workshop on
Reversible Computing (RC’05)
• A special session in the
ACM Computing Frontiers
conference (CF’05).
– To be held in Ischia, Italy,
May 4-6, 2005.
• Speakers include:
– Averin, Bennett, DeBenedictis, Forsberg, Frank, Fredkin,
Frost, Semenov, Toffoli, Vitanyi… (& others)
• Workshop website:
– http://www.eng.fsu.edu/~mpf/CF05/RC05.htm