Backing off from infinity


Fundamental communication limits in non-asymptotic regimes
Andrea Goldsmith
Thanks to collaborators Chen, Eldar, Grover, Mirghaderi, Weissman
Information Theory and Asymptopia
• Capacity with asymptotically small error, achieved by asymptotically long codes
• Defining capacity in terms of asymptotically small error and infinite delay is brilliant!
  – But it has also been limiting
  – Cause of the unconsummated union between networks and information theory
• Optimal compression based on properties of asymptotically long sequences
  – Leads to optimality of separation
• Other forms of asymptopia: infinite SNR, energy, sampling, precision, feedback, …
Why back off?
Theory not informing practice

Theory vs. practice:
  Theory                        Practice
  Infinite blocklength codes    Uncoded to LDPC
  Infinite SNR                  -7 dB in LTE
  Infinite energy               Finite battery life
  Infinite feedback             1-bit ARQ
  Infinite sampling rates       50-500 Msps
  Infinite (free) processing    200 MFLOPs-1B FLOPs
  Infinite precision ADCs       8-16 bits
What else lives in asymptopia?
Backing off from: infinite blocklength
• Recent developments on finite blocklength:
  – Channel codes (capacity C at finite blocklength n)
  – Source codes (entropy H or rate-distortion R(D))
  [Ingber, Kochman '11; Kostina, Verdu '11] [Wang et al. '11; Kostina, Verdu '12]
• Separation is not optimal at finite blocklength
Grand Challenges Workshop: CTW Maui
• From the perspective of the cellular industry, the Shannon bounds evaluated by Slepian are within 0.5 dB for a packet size of 30 bits or more for the real AWGN channel at 0.5 bits/sym, for BLER = 1e-4. In this perhaps narrow context there is not much uncertainty for performance evaluations.
• For cellular and general wireless channels, finite-blocklength bounds for practical fading models are needed, and there is very little work along those lines.
• Even for the AWGN channel, the computational effort of evaluating the Shannon bounds is formidable.
• This indicates a need for accurate approximations, such as those recently developed based on the idea of channel dispersion (a rough sketch follows below).
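As a rough companion to the dispersion idea above, here is a minimal Python sketch (mine, not from the talk) of the widely used normal approximation R(n, ε) ≈ C − sqrt(V/n)·Q⁻¹(ε) + log2(n)/(2n) for the real AWGN channel; the function name, the 10 dB SNR, and the ε = 1e-4 target are illustrative choices, and the result approximates rather than replaces the Shannon bounds.

```python
# Sketch of the normal (dispersion) approximation for the real AWGN channel:
# R(n, eps) ~ C - sqrt(V/n) * Qinv(eps) + log2(n)/(2n), with C, V in bits.
from math import log2, sqrt
from statistics import NormalDist

LOG2E = 1.4426950408889634  # log2(e)

def awgn_normal_approx(snr, n, eps):
    """Approximate maximal rate (bits/channel use) at blocklength n, block error eps."""
    C = 0.5 * log2(1 + snr)                                    # capacity
    V = (snr * (snr + 2)) / (2 * (snr + 1) ** 2) * LOG2E ** 2  # channel dispersion
    return C - sqrt(V / n) * NormalDist().inv_cdf(1 - eps) + log2(n) / (2 * n)

snr = 10 ** (10 / 10)          # illustrative: 10 dB SNR
for n in (30, 100, 1000, 10000):
    print(f"n={n:6d}  R ~ {awgn_normal_approx(snr, n, 1e-4):.3f} b/use "
          f"(C = {0.5 * log2(1 + snr):.3f})")
```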
Diversity vs. Multiplexing Tradeoff
• Use antennas for multiplexing (high rate, error prone) or for diversity (low Pe)
• Diversity/multiplexing tradeoff (Zheng/Tse), defined in the infinite-SNR limit (what is infinite here?):
  lim_{SNR→∞} log Pe(SNR) / log SNR = -d
  lim_{SNR→∞} R(SNR) / log SNR = r
  d*(r) = (Nt - r)(Nr - r)
  (A sketch evaluating d*(r) follows below.)
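For concreteness, a small sketch (mine) that evaluates the Zheng/Tse frontier d*(r) = (Nt − r)(Nr − r) at integer multiplexing gains, with the piecewise-linear interpolation from their result (which assumes blocklength at least Nt + Nr − 1); the helper name dmt and the 4x4 example are illustrative.

```python
# Sketch: Zheng-Tse diversity-multiplexing frontier d*(r) = (Nt - r)(Nr - r)
# at integer r, linearly interpolated in between (assumes n >= Nt + Nr - 1).
import numpy as np

def dmt(nt, nr, r):
    """Optimal diversity gain at multiplexing gain r (piecewise linear)."""
    r = np.asarray(r, dtype=float)
    k = np.clip(np.floor(r), 0, min(nt, nr) - 1)   # nearest integer point below r
    d_lo = (nt - k) * (nr - k)
    d_hi = (nt - k - 1) * (nr - k - 1)
    return np.where(r >= min(nt, nr), 0.0, d_lo + (r - k) * (d_hi - d_lo))

print(dmt(4, 4, [0, 0.5, 1, 2, 3, 4]))   # e.g. 16, 12.5, 9, 4, 1, 0
```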
Backing off from: infinite SNR
• High-SNR myth: use some spatial dimensions for multiplexing and others for diversity
• Reality: use all spatial dimensions for one or the other*
  – Diversity is wasteful of spatial dimensions with HARQ
  – Adapt modulation/coding to the channel SNR
*"Transmit Diversity vs. Spatial Multiplexing in Modern MIMO Systems", Lozano/Jindal
Diversity-Multiplexing-ARQ Tradeoff
• Suppose we allow ARQ with incremental redundancy
[Plot: diversity gain d (0-16) vs. multiplexing gain r (0-4) for ARQ window sizes L = 1, 2, 3, 4; larger L yields higher diversity at each r.]
• ARQ is a form of diversity [Caire/El Gamal 2005]
Joint Source/Channel Coding
• Use antennas for multiplexing:
  High-Rate Quantizer → ST Code → High-Rate Decoder (error prone)
• Use antennas for diversity:
  Low-Rate Quantizer → ST Code → High-Diversity Decoder (low Pe)
How should antennas be used? Depends on the end-to-end metric.
Joint Source-Channel Coding w/MIMO
[Block diagram: source encoder (s bits, index i) → index assignment p(i) → channel encoder → MIMO channel → channel decoder → inverse index assignment p(j) → source decoder (index j).]
• Increased rate at the source encoder decreases source distortion,
• but permits less diversity at the channel encoder,
• resulting in more errors, and maybe higher total distortion.
• A joint design is needed.
Antenna Assignment vs. SNR
Relaying in wireless networks
[Diagram: Source → Relay → Destination]
• Intermediate nodes (relays) in a route help to forward the packet to its final destination.
• Decode-and-forward (store-and-forward) is most common:
  – Packet is decoded, then re-encoded for transmission
  – Removes noise at the expense of complexity
• Amplify-and-forward: the relay just amplifies the received packet
  – Also amplifies noise: works poorly for long routes and at low SNR
• Compress-and-forward: the relay compresses the received packet
  – Used when the source-relay link is good and the relay-destination link is weak
Capacity of the relay channel is unknown: we only have bounds.
Cooperation in Wireless Networks
• Relaying is a simple form of cooperation
• Many more complex ways to cooperate:
  – Virtual MIMO, generalized relaying, interference forwarding, and one-shot/iterative conferencing
• Many theoretical and practical issues:
  – Overhead, forming groups, dynamics, full-duplex, synchronization, …
Generalized Relaying and Interference Forwarding
[Diagram: TX1 sends X1, TX2 sends X2; the relay observes Y3 = X1 + X2 + Z3 and transmits X3 = f(Y3) (analog network coding); RX1 receives Y4 = X1 + X2 + X3 + Z4 and RX2 receives Y5 = X1 + X2 + X3 + Z5.]
• The relay can forward message and/or interference
  – Relay can forward all or part of the messages: much room for innovation
  – Relay can forward interference, to help subtract it out
• Beneficial to forward both interference and message
  – In fact, it can achieve capacity [Maric/Goldsmith '12]
[Diagram: source S with power Ps, relays with powers P1-P4, destination D.]
• For large powers Ps, P1, P2, …, analog network coding (AF) approaches capacity: asymptopia?
Interference Alignment
• Addresses the number of interference-free signaling dimensions in an interference channel
• Based on our orthogonal analysis earlier, it would appear that resources need to be divided evenly, so only 2BT/N dimensions are available
• Jafar and Cadambe showed that by aligning interference, 2BT/2 dimensions are available
  – Everyone gets half the cake!
  – Except at finite SNRs…
Backing off from: infinite SNR
• High-SNR myth: decode-and-forward is equivalent to amplify-and-forward, which is optimal at high SNR*
  – The noise-amplification drawback of AF diminishes at high SNR
  – Amplify-and-forward achieves full degrees of freedom in MIMO systems (Borade/Zheng/Gallager '07)
  – At high SNR, amplify-and-forward is within a constant gap of the capacity upper bound as the received powers increase (Maric/Goldsmith '07)
• Reality: optimal relaying is unknown at most SNRs
  – Amplify-and-forward is highly suboptimal outside the high per-node-SNR regime, which is not always the high-power or high-channel-gain regime
  – Amplify-and-forward has an unbounded gap from capacity in the high-channel-gain regime (Avestimehr/Diggavi/Tse '11)
  – The relay strategy should depend on the worst link; decode-and-forward is used in practice (see the toy comparison below)
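To make the worst-link point concrete, here is a toy comparison (my assumptions: a two-hop full-duplex line network with no direct source-destination link, real Gaussian signaling, and the standard amplified-noise end-to-end SNR for AF). DF is limited by the weaker hop, while AF pays a noise-amplification penalty that matters most at low SNR.

```python
# Toy comparison of decode-and-forward vs amplify-and-forward on a two-hop
# link (no direct path). Assumes full-duplex relaying and real Gaussian signaling.
from math import log2

def c(snr):                       # real AWGN capacity, bits/channel use
    return 0.5 * log2(1 + snr)

def rate_df(snr1, snr2):          # DF: limited by the weaker hop
    return min(c(snr1), c(snr2))

def rate_af(snr1, snr2):          # AF: end-to-end SNR with amplified noise
    return c(snr1 * snr2 / (snr1 + snr2 + 1))

for db in (-5, 0, 10, 30):        # equal per-hop SNRs, in dB (illustrative)
    snr = 10 ** (db / 10)
    print(f"{db:>3} dB per hop: DF {rate_df(snr, snr):.2f}  AF {rate_af(snr, snr):.2f} b/use")
```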
Capacity and Feedback
• Capacity under feedback is largely unknown for:
  – Channels with memory
  – Finite-rate and/or noisy feedback
  – Multiuser channels
  – Multihop networks
• ARQ is ubiquitous in practice
  – Works well on finite-rate noisy feedback channels
  – Reduces end-to-end delay
• Why hasn't theory met practice when it comes to feedback?
PtP Memoryless Channels: Perfect Feedback
[Diagram: W → Encoder → noisy channel → Decoder → Ŵ = W, with the channel output fed back to the encoder.]
• Shannon: feedback does not increase the capacity of DMCs
• Schalkwijk-Kailath scheme for AWGN channels:
  – Low-complexity linear recursive scheme
  – Achieves capacity
  – Double-exponential decay in error probability (simulated in the sketch below)
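A minimal simulation sketch of a Schalkwijk-Kailath-style scheme (a simplified textbook formulation, not necessarily the exact construction referenced here): each channel use, the transmitter sends the receiver's current estimation error scaled to the power constraint, the receiver applies an LMMSE update, and noiseless feedback returns the new estimate; the error variance shrinks by a factor (1 + P) per use, which is what drives the double-exponential error decay. The message size, power, and round count below are illustrative.

```python
# Sketch of a Schalkwijk-Kailath-style feedback scheme on the AWGN channel.
# Message mapped to a point theta in (-0.5, 0.5); each round the transmitter
# sends the current estimation error scaled via the deterministic variance
# recursion, the receiver does an LMMSE update, and noiseless feedback
# returns the new estimate.
import numpy as np

rng = np.random.default_rng(0)
M, P, n_rounds = 2 ** 8, 1.0, 20          # 8-bit message, unit power, 20 uses

def sk_transmit(msg):
    theta = (msg + 0.5) / M - 0.5          # message point in (-0.5, 0.5)
    theta_hat, var = 0.0, 1.0 / 12.0       # receiver estimate and its variance
    for _ in range(n_rounds):
        x = np.sqrt(P / var) * (theta - theta_hat)      # scaled estimation error
        y = x + rng.standard_normal()                    # unit-variance AWGN
        theta_hat += np.sqrt(P * var) / (P + 1) * y      # LMMSE update
        var /= (1 + P)                                   # variance recursion
    return int(np.clip(np.round((theta_hat + 0.5) * M - 0.5), 0, M - 1))

errors = sum(sk_transmit(m) != m for m in rng.integers(0, M, size=200))
print(f"decoded {200 - errors}/200 messages correctly "
      f"(rate {np.log2(M) / n_rounds:.2f} b/use, capacity {0.5 * np.log2(1 + P):.2f})")
```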
Backing off from: Perfect Feedback
[Diagram: message m ∈ {1, …, e^{nR}} → channel encoder → X_i → AWGN channel with noise N(0,1) → Y_i → decoder → m̂, with a feedback module returning U_i to the encoder.]
• [Shannon '59]: no feedback: Pr{m̂ ≠ m} ≈ e^{-O(n)}
• [Pinsker, Gallager et al.]: perfect feedback (infinite rate, no noise):
  Pr{m̂ ≠ m} ≤ exp(-exp(…exp(O(n))…)), an O(n)-fold exponential
• [Kim et al. '07/'10]: feedback corrupted by AWGN: Pr{m̂ ≠ m} ≈ e^{-O(n)}
• [Polyanskiy et al. '10]: noiseless feedback reduces the minimum energy per bit when nR is fixed and n → ∞
Gaussian Channel with Rate-Limited Feedback
[Diagram: channel encoder → X_i → AWGN channel with noise N(0,1) → Y_i → decoder → m̂; a feedback module returns U_i over a rate-limited, noiseless feedback link.]
• Constraints: E[ Σ_{i=1}^n |X_i|² ] ≤ nP
• Objective: choose the encoder and feedback map f to maximize the decay rate of the error probability P_e(n, R, R_FB, P)
• A super-exponential error probability is achievable if and only if R_FB ≥ R
  – R_FB < R: the error exponent is finite but higher than the no-feedback error exponent:
    P_e(n, R, R_FB, P) ≤ e^{-n(E_NoFB(R) + R_FB + o(1))}
  – R_FB ≥ R: double-exponential error probability: P_e(n, R, R_FB, P) ≤ e^{-e^{O(n)}}
  – R_FB ≥ L·R: L-fold exponential error probability:
    P_e(n, R, R_FB, P) ≤ exp(-exp(…exp(O(n))…)) with L exponentials
Feedback under Energy/Delay Constraint
[Diagram: an m-bit encoder sends S = b_1…b_m over the forward channel with energy E_t; the m-bit decoder's estimate Ŝ_t satisfies Pr{Ŝ_t ≠ S} = ε(E_t). The decoder sends Ŝ_t back over the feedback channel with energy E_t^FB, and the encoder's copy satisfies Pr{S_t ≠ Ŝ_t} = ε(E_t^FB). If S_t = S, the encoder sends a termination alarm; otherwise it resends with energy E_{t+1}. When the termination alarm is received, the decoder reports Ŝ_t as the decoded message.]
• Constraints: decoding delay ≤ T; total energy Σ_{t=1}^T (E_t + E_t^FB) ≤ E_tot
• Objective: choose {E_t, E_t^FB}_{t=1}^T to minimize the overall probability of error P_e(E_tot, T)
Feedback Gain under Energy/Delay Constraint
• The gain depends on the error probability model ε(·)
• Exponential error model: ε(x) = βe^{-αx}
  – Applicable when Tx energy dominates
  – Feedback gain is high if the total energy budget E_tot is large enough
  – No feedback gain for energy budgets below a threshold
• Super-exponential error model: ε(x) = βe^{-αx²}
  – Applicable when Tx and coding energy are comparable
  – No feedback gain for energy budgets above a threshold
Backing off from: perfect feedback
• Memoryless point-to-point channels:
  – Capacity is unchanged with perfect feedback
  – A simple linear scheme improves the error exponent (Schalkwijk-Kailath: double-exponential decay)
  – Feedback reduces energy consumption (compared with no feedback)
• Capacity of feedback channels is largely unknown:
  – Unknown for general channels with memory and perfect feedback
  – Unknown under finite-rate and/or noisy feedback
  – Unknown in general for multiuser channels
  – Unknown in general for multihop networks
• ARQ is ubiquitous in practice
  – Assumes channel errors
  – Works well on finite-rate noisy feedback channels
  – Reduces end-to-end delay

How to use feedback in wireless networks?
• Output feedback
• Channel information (CSI), possibly noisy/compressed
• Acknowledgements
• Something else?
Interesting applications to neuroscience
Backing off from: infinite sampling
[New channel: the analog channel followed by a sampling mechanism at rate fs.]
• For a given sampling mechanism (i.e., a "new" channel):
  – What is the optimal input signal?
  – What is the tradeoff between capacity and sampling rate?
  – What known sampling methods lead to the highest capacity?
• What is the optimal sampling mechanism, among all possible (known and unknown) sampling schemes?
Capacity under Sampling w/Prefilter
[Diagram: x(t) → channel h(t) → noise η(t) added → prefilter s(t) → uniform sampling at t = nT_s → y[n].]
• Theorem: the channel capacity is determined by water-filling over the "folded" SNR filtered by S(f); the prefilter suppresses aliasing. (A water-filling sketch follows below.)
• Capacity is not monotonic in fs
  – Consider a "sparse" channel: capacity is not monotonic in fs!
  – Single-branch sampling fails to exploit the channel structure
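The water-filling step itself is easy to sketch. In the snippet below (mine), the per-subchannel SNRs are hypothetical stand-ins for the folded SNR profile shaped by the prefilter S(f); the exact folded-SNR expression is the subject of the theorem above.

```python
# Sketch: water-filling over a set of parallel sub-channel SNRs, as used to
# evaluate the sampled-channel capacity (the SNR values here are illustrative;
# in the theorem they would come from the folded SNR shaped by S(f)).
import numpy as np

def waterfill(snrs, total_power):
    """Allocate total_power over parallel channels with per-unit-power SNRs."""
    snrs = np.asarray(snrs, dtype=float)
    inv = 1.0 / snrs                              # "vessel floor" levels
    inv_sorted = np.sort(inv)
    mu = 0.0
    for k in range(len(inv_sorted), 0, -1):       # find the number of active channels
        mu = (total_power + inv_sorted[:k].sum()) / k        # candidate water level
        if mu > inv_sorted[k - 1]:
            break
    power = np.maximum(mu - inv, 0.0)
    rate = 0.5 * np.sum(np.log2(1.0 + power * snrs))         # bits per (real) use
    return power, rate

p, r = waterfill([3.0, 1.0, 0.2, 0.05], total_power=2.0)     # illustrative numbers
print("power allocation:", np.round(p, 3), " capacity ~", round(r, 3), "b/use")
```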
Filter Bank Sampling
[Diagram: x(t) → channel h(t) → noise η(t) added → bank of filters s_1(t), …, s_m(t), each sampled at t = n(mT_s) → outputs y_1[n], …, y_m[n].]
• Theorem: capacity of the sampled channel using a bank of m filters with aggregate rate fs
  – Similar to MIMO, but with no combining!
Equivalent MIMO Channel Model
[Diagram: for each f, the branch outputs Y_1(f), …, Y_m(f) collect the aliased components H(f + kf_s) X(f + kf_s) + N(f + kf_s), each shaped by the corresponding filter response S_i(f + kf_s), giving an equivalent MIMO channel at every frequency.]
• MIMO decoupling: pre-whitening, then water-filling over the singular values
• Theorem 3: the channel capacity of the sampled channel using a bank of m filters with aggregate rate fs is given by water-filling over the singular values of the pre-whitened equivalent MIMO channel
Joint Optimization of Input and Filter Bank
• The optimized filter bank selects the m branches with the m highest SNRs
• Example (bank of 2 branches):
  [Spectrum sketch: among the aliased components of H(f), branch 1 (output Y1(f)) captures the highest-SNR component and branch 2 (output Y2(f)) the 2nd-highest; low-SNR components are left unsampled.]
• With the jointly optimized input and filter bank, capacity is monotonic in fs
• Can we do better?
Sampling with Modulator+Filter (1 or more)
[Diagram: x(t) → channel h(t) → noise η(t) added → modulator q(t) → filter p(t) → uniform sampling → y[n]; with m branches, each branch i has its own modulator/filter and is sampled at t = n(mT_s), giving y_i[n].]
• Theorem: a bank of modulator+filter branches is at least as good as a single branch, and equals the filter bank in capacity
• Theorem: optimal among all time-preserving nonuniform sampling techniques of rate fs
Backing off from: Infinite processing power
• Is Shannon capacity still a good metric for system design?
• Our approach: account for power consumption via a network graph
  – Power consumed in nodes and wires
  [Graph: computation nodes X5-X8 connected by wires to bits B1-B4.]
  – Extends early work of El Gamal et al. '84 and Thompson '80
Fundamental area-time-performance tradeoffs
• For encoding/decoding "good" codes, the tradeoff involves the area occupied by wires and the number of encoding/decoding clock cycles
• Stay away from capacity! Close to capacity we have:
  – Large chip area
  – More time
  – More power
  Total power diverges to infinity!
• Regular LDPCs come closer to the bound than capacity-approaching LDPCs!
• Need novel code designs with short wires and good performance (a toy illustration of the resulting transmit-vs-processing power tradeoff follows)
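A deliberately crude numerical toy of the "stay away from capacity" message (all modeling choices below are mine, not the cited analysis): transmit power is what a real AWGN link needs to keep the channel capacity a margin g above the code rate, while decoder power is assumed to grow like 1/g as that margin shrinks; their sum is then minimized at a strictly positive gap rather than at capacity.

```python
# Toy model of total power vs. gap-to-capacity (illustrative assumptions only:
# the 1/gap decoder-power law and the constants are invented; the qualitative
# point is that transmit + processing power is minimized away from capacity).
import numpy as np

R = 1.0        # target rate, bits per (real) channel use
KAPPA = 0.05   # assumed decoder-power coefficient (hypothetical)

def tx_power(gap):
    """SNR (linear) needed so that channel capacity equals R + gap."""
    return 2.0 ** (2.0 * (R + gap)) - 1.0

def decoder_power(gap):
    """Assumed to diverge as the gap to capacity closes (hypothetical 1/gap law)."""
    return KAPPA / gap

gaps = np.linspace(0.01, 1.0, 200)
total = tx_power(gaps) + decoder_power(gaps)   # vectorized over the gap grid
print(f"toy optimum: operate about {gaps[np.argmin(total)]:.2f} b/use below capacity,"
      f" not at capacity")
```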
Conclusions
• Information theory asymptopia has provided much insight and decades of sublime delight to researchers
• Backing off from infinity is required for some problems to gain insight and fundamental bounds
• New mathematical tools, and new ways of applying conventional tools, are needed for these problems
• Many interesting applications in finance, biology, neuroscience, …