Quantum Communication: A real Enigma
Quantum Information Theory
Patrick Hayden (McGill)
4 August 2005, Canadian Quantum Information Summer School
Overview
Part I:
What is information theory?
What does it have to do with quantum mechanics?
Noise in the quantum mechanical formalism
Entropy, compression, noisy coding and beyond
Density operators, the partial trace, quantum operations
Some quantum information theory highlights
Part II:
Resource inequalities
A skeleton key
Information (Shannon) theory
A practical question: how best to make use of a given
communications resource?
A mathematico-epistemological question: how to quantify
uncertainty and information?
Shannon:
Solved the first by considering the second.
A mathematical theory of communication [1948]
Quantifying uncertainty
Entropy: H(X) = −∑ₓ p(x) log₂ p(x)
Proportional to entropy of statistical physics
Term suggested by von Neumann
(more on him later)
Can arrive at definition axiomatically:
H(X,Y) = H(X) + H(Y) for independent X, Y, etc.
Operational point of view…
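The definition above translates directly into code; a minimal sketch (the function name is my own), which also illustrates the additivity axiom H(X,Y) = H(X) + H(Y) for independent X, Y:

```python
import math

def shannon_entropy(p):
    """H(X) = -sum_x p(x) log2 p(x), in bits; terms with p(x) = 0 contribute nothing."""
    return -sum(px * math.log2(px) for px in p if px > 0)

print(shannon_entropy([0.5, 0.5]))                # fair coin: 1.0
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # two independent fair coins: 2.0
```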
Compression
Source of independent copies of X
X₁, X₂, …, Xₙ
If X is binary:
0000100111010100010101100101
About nP(X=0) 0’s and nP(X=1) 1’s
{0,1}ⁿ: 2ⁿ possible strings
≈ 2^{nH(X)} typical strings
Can compress n copies of X to
a binary string of length ~nH(X)
Typicality in more detail
Let xⁿ = x₁,x₂,…,xₙ with each xⱼ ∈ X
We say that xⁿ is δ-typical with respect to p(x) if:
For all a ∈ X with p(a) > 0, |(1/n) N(a|xⁿ) − p(a)| < δ/|X|
For all a ∈ X with p(a) = 0, N(a|xⁿ) = 0.
For δ > 0, the probability that a random string
Xⁿ is δ-typical goes to 1.
If xⁿ is δ-typical, 2^{−n[H(X)+δ]} ≤ p(xⁿ) ≤ 2^{−n[H(X)−δ]}
The number of δ-typical strings is bounded
above by 2^{n[H(X)+δ]}
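The two typicality conditions can be checked empirically; a small sketch (the helper name is my own), using the δ/|X| tolerance convention from this slide:

```python
import random

def is_typical(xs, p, delta):
    """Check delta-typicality of the string xs w.r.t. the pmf p (a dict a -> p(a)):
    every letter's empirical frequency is within delta/|X| of p(a), and
    letters with p(a) = 0 do not occur."""
    n = len(xs)
    for a, pa in p.items():
        freq = xs.count(a) / n
        if pa == 0 and freq > 0:
            return False
        if pa > 0 and abs(freq - pa) >= delta / len(p):
            return False
    return True

random.seed(0)
p = {0: 0.9, 1: 0.1}
sample = random.choices([0, 1], weights=[0.9, 0.1], k=10000)
print(is_typical(sample, p, 0.1))  # a long random string is typical with high probability
```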
Quantifying information
[Venn diagram: H(X) and H(Y) overlap in I(X;Y); the remainders are
H(X|Y) and H(Y|X); the union is H(X,Y)]
H(X|Y) = uncertainty in X when the value of Y is known
H(X|Y) = H(X,Y) − H(Y) = E_Y H(X|Y=y)
Information is that which reduces uncertainty:
I(X;Y) = H(X) − H(X|Y) = H(X) + H(Y) − H(X,Y)
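The identity I(X;Y) = H(X) + H(Y) − H(X,Y) can be evaluated directly from a joint distribution; a minimal sketch (function names are my own):

```python
import math

def H(probs):
    return -sum(q * math.log2(q) for q in probs if q > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with the joint pmf given as a nested list."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    pxy = [q for row in joint for q in row]
    return H(px) + H(py) - H(pxy)

print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))     # perfectly correlated bits: 1.0
print(mutual_information([[0.25, 0.25], [0.25, 0.25]])) # independent bits: 0.0
```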
Sending information
through noisy channels
Statistical model of a noisy channel:
m → Encoding → p(y|x) → Decoding → m′
Shannon's noisy coding theorem: In the limit of many uses, the optimal
rate at which Alice can send messages reliably to Bob through the channel is
given by the formula C = max_{p(x)} I(X;Y)
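For a concrete instance, the binary symmetric channel with flip probability p has capacity 1 − H(p), attained by the uniform input; a quick numerical check of the maximization over input distributions (function names are my own):

```python
import math

def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_info(q, flip):
    """I(X;Y) for the binary symmetric channel when P(X=1) = q."""
    py1 = q * (1 - flip) + (1 - q) * flip
    return h2(py1) - h2(flip)

def bsc_capacity(flip):
    """Closed form: the uniform input is optimal, so C = 1 - h2(flip)."""
    return 1.0 - h2(flip)

# A coarse grid search over input distributions recovers the closed form:
best = max(bsc_mutual_info(q / 100, 0.1) for q in range(101))
print(best, bsc_capacity(0.1))
```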
Data processing inequality
Alice holds X, Bob holds Y, jointly distributed as (X,Y).
Alice processes X through a channel p(z|x) to produce Z; Bob's Y is untouched.
I(X;Y) ≥ I(Z;Y)
Optimality in Shannon’s theorem
m → Encoding → Xⁿ → p(y|x) → Yⁿ → Decoding → m′
Shannon's noisy coding theorem: In the limit of many uses, the optimal
rate at which Alice can send messages reliably to Bob through the channel is
given by the formula C = max_{p(x)} I(X;Y)
Assume there exists a code with rate R and perfect decoding. Let M be
the random variable corresponding to the uniform distribution over messages.
nR = H(M)                   [M has nR bits of entropy]
   = I(M;M′)                [perfect decoding: M = M′]
   ≤ I(M;Yⁿ)                [data processing]
   ≤ I(Xⁿ;Yⁿ)               [data processing]
   ≤ ∑_{j=1}^n I(Xⱼ;Yⱼ)     [some fiddling]
   ≤ n·max_{p(x)} I(X;Y)    [term by term]
Shannon theory provides
Practically speaking:
Conceptually speaking:
A holy grail for error-correcting codes
An operationally-motivated way of thinking about
correlations
What’s missing (for a quantum mechanic)?
Features from linear structure:
Entanglement and non-orthogonality
Quantum Shannon Theory
provides
General theory of interconvertibility
between different types of
communications resources: qubits,
cbits, ebits, cobits, sbits…
Relies on a major simplifying assumption:
computation is free
And a minor simplifying assumption:
noise and data have regular structure
Before we get going:
Some unavoidable formalism
We need quantum generalizations of:
Probability distributions (density operators)
Marginal distributions (partial trace)
Noisy channels (quantum operations)
Mixing quantum states:
The density operator
Draw |x⟩ with probability p(x)
Perform a measurement {|0⟩, |1⟩}:
Probability of outcome j:
qⱼ = ∑ₓ p(x) |⟨j|x⟩|²
   = ∑ₓ p(x) tr[|j⟩⟨j| · |x⟩⟨x|]
   = tr[|j⟩⟨j| ρ],
where ρ = ∑ₓ p(x) |x⟩⟨x|
Outcome probability is linear in ρ
Properties of the density operator
ρ is Hermitian:
ρ† = [∑ₓ p(x) |x⟩⟨x|]† = ∑ₓ p(x) [|x⟩⟨x|]† = ρ
ρ is positive semidefinite:
⟨ψ|ρ|ψ⟩ = ∑ₓ p(x) ⟨ψ|x⟩⟨x|ψ⟩ ≥ 0
tr[ρ] = 1:
tr[ρ] = ∑ₓ p(x) tr[|x⟩⟨x|] = ∑ₓ p(x) = 1
Ensemble ambiguity:
I/2 = ½[|0⟩⟨0| + |1⟩⟨1|] = ½[|+⟩⟨+| + |−⟩⟨−|]
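These properties, and the ensemble ambiguity on this slide, are easy to verify numerically (a sketch assuming NumPy is available):

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Two different ensembles that yield the same density operator:
rho_comp = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket1, ket1)
rho_had = 0.5 * np.outer(plus, plus) + 0.5 * np.outer(minus, minus)

print(np.allclose(rho_comp, rho_had))                  # same operator: I/2
print(np.isclose(np.trace(rho_comp), 1.0))             # unit trace
print(np.all(np.linalg.eigvalsh(rho_comp) >= -1e-12))  # positive semidefinite
```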
The density operator: examples
Which of the following are density operators?
The partial trace
Suppose that ρ^{AB} is a density operator
on A⊗B
Alice measures {Mₖ} on A
Outcome probability is
qₖ = tr[(Mₖ ⊗ I_B) ρ^{AB}]
Define ρ^A = tr_B[ρ^{AB}] = ∑ⱼ ⟨j|_B ρ^{AB} |j⟩_B.
Then qₖ = tr[Mₖ ρ^A]
ρ^A describes outcome statistics for all
possible experiments by Alice alone
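The formula ρ^A = ∑ⱼ ⟨j|_B ρ^{AB} |j⟩_B can be implemented as a reshape-and-trace; a sketch assuming NumPy (the helper name is my own):

```python
import numpy as np

def partial_trace_B(rho_AB, dA, dB):
    """rho_A = tr_B[rho_AB]: reshape to indices (a, b, a', b') and trace over b = b'."""
    return rho_AB.reshape(dA, dB, dA, dB).trace(axis1=1, axis2=3)

# Maximally entangled state |phi> = (|00> + |11>)/sqrt(2):
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
rho_AB = np.outer(phi, phi)
print(partial_trace_B(rho_AB, 2, 2))   # the maximally mixed state I/2
```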
Purification
Suppose that ρ^A is a density
operator on A
Diagonalize ρ^A = ∑ᵢ λᵢ |i⟩⟨i|
Let |φ⟩ = ∑ᵢ λᵢ^{1/2} |i⟩^A |i⟩^B
Note that ρ^A = tr_B[|φ⟩⟨φ|]
|φ⟩ is a purification of ρ^A
Symmetry: ρ^A and ρ^B have the
same non-zero eigenvalues
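The construction on this slide (diagonalize, then pair eigenvectors with a reference basis) in code, assuming NumPy (the function name is my own):

```python
import numpy as np

def purify(rho):
    """Return |phi> = sum_i lambda_i^(1/2) |i>_A |i>_B, so that tr_B |phi><phi| = rho."""
    vals, vecs = np.linalg.eigh(rho)
    d = rho.shape[0]
    phi = np.zeros(d * d)
    for i in range(d):
        phi += np.sqrt(max(vals[i], 0.0)) * np.kron(vecs[:, i], np.eye(d)[i])
    return phi

rho = np.diag([0.75, 0.25])
phi = purify(rho)
recovered = np.outer(phi, phi).reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)
print(np.allclose(recovered, rho))   # tr_B of the purification gives back rho
```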
Quantum (noisy) channels:
Analogs of p(y|x)
What reasonable constraints might such a channel ℰ: A → B satisfy?
1) Take density operators to density operators
2) Convex linearity: a mixture of input states should be mapped to
a corresponding mixture of output states
All such maps can, in principle, be realized physically
Constraint 1) must be interpreted very strictly:
require that (ℰ ⊗ I_C)(ρ^{AC}) always be a density operator too
Doesn't come for free! Let T be the transpose map on A.
If |φ⟩ = |00⟩^{AC} + |11⟩^{AC}, then (T ⊗ I_C)(|φ⟩⟨φ|) has negative eigenvalues
The resulting set of transformations on density operators is known as the set of
trace-preserving, completely positive maps
Quantum channels: examples
Adjoining an ancilla: ρ → ρ ⊗ |0⟩⟨0|
Unitary transformations: ρ → U ρ U†
Partial trace: ρ^{AB} → tr_B[ρ^{AB}]
That's it! All channels can be built out of
these operations:
ℰ(ρ) = tr_E[U (ρ ⊗ |0⟩⟨0|) U†]
Further examples
The depolarizing channel:
ρ → (1−p) ρ + p I/2
The dephasing channel:
ρ → ∑ⱼ ⟨j|ρ|j⟩ |j⟩⟨j|
Equivalent to measuring {|j⟩} then forgetting the outcome
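Both examples act simply on a density matrix; a direct sketch assuming NumPy (written out on matrices rather than via Kraus operators), applied to the pure state |+⟩⟨+|:

```python
import numpy as np

def depolarize(rho, p):
    """The depolarizing channel: rho -> (1-p) rho + p I/2."""
    return (1 - p) * rho + p * np.eye(2) / 2

def dephase(rho):
    """The dephasing channel in the computational basis:
    rho -> sum_j <j|rho|j> |j><j| (off-diagonal terms are destroyed)."""
    return np.diag(np.diag(rho))

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)
print(dephase(rho))           # coherences destroyed: I/2
print(depolarize(rho, 1.0))   # completely depolarized: I/2
```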
One last thing you should see...
What happens if a measurement is
preceded by a general quantum
operation?
Leads to more general types of
measurements: Positive Operator-Valued
Measures (forevermore POVMs)
{Mₖ} such that Mₖ ≥ 0, ∑ₖ Mₖ = I
Probability of outcome k is tr[Mₖ ρ]
POVM’s:
What are they good for?
Try to distinguish |ψ₀⟩ = |0⟩ and |ψ₁⟩ = |+⟩ = (|0⟩+|1⟩)/2^{1/2}
States are non-orthogonal, so projective measurements won't work.
Let N = 1/(1+1/2^{1/2}).
Exercise: M₀ = N|1⟩⟨1|, M₁ = N|−⟩⟨−|, M₂ = I − M₀ − M₁ is a POVM
Note:
* Outcome 0 implies the state was |ψ₁⟩
* Outcome 1 implies the state was |ψ₀⟩
* Outcome 2 is inconclusive
Instead of imperfect distinguishability all of the time,
the POVM provides perfect distinguishability some of the time.
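The exercise can be verified numerically (a sketch assuming NumPy):

```python
import numpy as np

s = 1 / np.sqrt(2)
ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus = (ket0 + ket1) * s
minus = (ket0 - ket1) * s

N = 1 / (1 + s)
M0 = N * np.outer(ket1, ket1)
M1 = N * np.outer(minus, minus)
M2 = np.eye(2) - M0 - M1

rho0 = np.outer(ket0, ket0)   # |psi_0> = |0>
rho1 = np.outer(plus, plus)   # |psi_1> = |+>

print(np.isclose(np.trace(M0 @ rho0), 0.0))   # outcome 0 never fires on |psi_0>
print(np.isclose(np.trace(M1 @ rho1), 0.0))   # outcome 1 never fires on |psi_1>
print(all(np.all(np.linalg.eigvalsh(M) >= -1e-12) for M in (M0, M1, M2)))  # valid POVM
```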
Notions of distinguishability
Basic requirement: quantum channels do not increase “distinguishability”
Fidelity:
F(ρ,σ) = [tr (ρ^{1/2} σ ρ^{1/2})^{1/2}]²
F = 0 for perfectly distinguishable; F = 1 for identical
F(ρ,σ) = max |⟨φᵨ|φ_σ⟩|², maximized over purifications |φᵨ⟩ of ρ and |φ_σ⟩ of σ
F(ℰ(ρ), ℰ(σ)) ≥ F(ρ,σ)
Trace distance:
T(ρ,σ) = ‖ρ − σ‖₁
T = 2 for perfectly distinguishable; T = 0 for identical
T(ρ,σ) = 2 max |p(k=0|ρ) − p(k=0|σ)|, where the max is over measurements {Mₖ}
T(ρ,σ) ≥ T(ℰ(ρ), ℰ(σ))
Statements made today hold for both measures
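Both measures in code, checked on the extreme cases from this slide (a sketch assuming NumPy; `sqrtm_psd` is my own helper for the Hermitian square root):

```python
import numpy as np

def sqrtm_psd(rho):
    """Hermitian square root via diagonalization (clipping tiny negative eigenvalues)."""
    vals, vecs = np.linalg.eigh(rho)
    return vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = [tr (rho^(1/2) sigma rho^(1/2))^(1/2)]^2."""
    s = sqrtm_psd(rho)
    return np.trace(sqrtm_psd(s @ sigma @ s)).real ** 2

def trace_distance(rho, sigma):
    """T(rho, sigma) = ||rho - sigma||_1: sum of absolute eigenvalues of the difference."""
    return np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

rho0, rho1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
print(fidelity(rho0, rho1), trace_distance(rho0, rho1))   # distinguishable: 0 and 2
print(fidelity(rho0, rho0), trace_distance(rho0, rho0))   # identical: 1 and 0
```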
Back to information theory!
Quantifying uncertainty
Let ρ = ∑ₓ p(x) |x⟩⟨x| be a density operator
von Neumann entropy:
H(ρ) = −tr[ρ log ρ]
Equal to the Shannon entropy of the eigenvalues of ρ
Analog of a joint random variable:
ρ^{AB} describes a composite system A⊗B
H(A) = H(ρ^A) = H(tr_B ρ^{AB})
Quantifying uncertainty:
examples
H(|ψ⟩⟨ψ|) = 0
H(I/2) = 1
H(ρ ⊗ σ) = H(ρ) + H(σ)
H((I/2)^{⊗n}) = n
H(p ρ ⊕ (1−p) σ) =
H(p, 1−p) + p H(ρ) + (1−p) H(σ)
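Each example reduces to the Shannon entropy of the eigenvalues; numerically (a sketch assuming NumPy; the function name is my own):

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -tr[rho log2 rho] = Shannon entropy of the eigenvalues of rho."""
    vals = np.linalg.eigvalsh(rho)
    vals = vals[vals > 1e-12]   # zero eigenvalues contribute nothing
    return float(-(vals * np.log2(vals)).sum())

print(von_neumann_entropy(np.diag([1.0, 0.0])))   # pure state: 0
print(von_neumann_entropy(np.eye(2) / 2))         # maximally mixed qubit: 1
print(von_neumann_entropy(np.eye(4) / 4))         # (I/2) tensor (I/2): additivity gives 2
```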
Compression
Source of independent copies of ρ^{AB}: (ρ^{AB})^{⊗n} on A₁B₁, A₂B₂, …, AₙBₙ
No statistical assumptions:
just quantum mechanics!
dim(effective support of (ρ^B)^{⊗n}) ~ 2^{nH(B)}  (aka the typical subspace)
Can compress n copies of B to
a system of ~nH(B) qubits while
preserving correlations with A
[Schumacher, Petz]
The typical subspace
Diagonalize ρ = ∑ₓ p(x) |eₓ⟩⟨eₓ|
Then ρ^{⊗n} = ∑_{xⁿ} p(xⁿ) |e_{xⁿ}⟩⟨e_{xⁿ}|
The δ-typical projector Π_δ is the projector
onto the span of the |e_{xⁿ}⟩⟨e_{xⁿ}| such that
xⁿ is δ-typical
tr[ρ^{⊗n} Π_δ] → 1 as n → ∞
Quantifying information
[Venn diagram: H(A) and H(B) overlap in I(A;B); the remainders are
H(A|B) and H(B|A); the union is H(AB)]
Uncertainty in A when the value of B is known?
H(A|B) = H(AB) − H(B)
Example: |φ⟩^{AB} = (|0⟩^A|0⟩^B + |1⟩^A|1⟩^B)/2^{1/2}
ρ^B = I/2, so H(A|B) = 0 − 1 = −1
Conditional entropy can
be negative!
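The example on this slide, checked numerically (a sketch assuming NumPy; `vn_entropy` is my own helper):

```python
import numpy as np

def vn_entropy(rho):
    vals = np.linalg.eigvalsh(rho)
    vals = vals[vals > 1e-12]
    return float(-(vals * np.log2(vals)).sum())

# Bell state (|00> + |11>)/sqrt(2):
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
rho_AB = np.outer(phi, phi)
rho_B = rho_AB.reshape(2, 2, 2, 2).trace(axis1=0, axis2=2)   # tr_A

H_AB = vn_entropy(rho_AB)   # 0: the global state is pure
H_B = vn_entropy(rho_B)     # 1: the marginal is maximally mixed
print(H_AB - H_B)           # H(A|B) = -1
```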
Quantifying information
[Venn diagram: H(A) and H(B) overlap in I(A;B); the remainders are
H(A|B) and H(B|A); the union is H(AB)]
H(A|B) = H(AB) − H(B)
Information is that which reduces uncertainty:
I(A;B) = H(A) − H(A|B) = H(A) + H(B) − H(AB) ≥ 0
Sending classical information
through noisy channels
Physical model of a noisy channel: ℰ
(trace-preserving, completely positive map)
m → Encoding (ρ state) → ℰ^{⊗n} → Decoding (measurement) → m′
HSW noisy coding theorem: In the limit of many uses, the optimal
rate at which Alice can send messages reliably to Bob through ℰ is
given by the (regularization of the) formula
C(ℰ) = max_{{p(x),ρₓ}} I(X;B),
where I(X;B) is evaluated on the state σ^{XB} = ∑ₓ p(x) |x⟩⟨x|^X ⊗ ℰ(ρₓ)^B
Sending classical information
through noisy channels
m → Encoding (ρ state) → ℰ^{⊗n} → Decoding (measurement) → m′
Choose random codewords X₁,X₂,…,Xₙ. The outputs in Bⁿ occupy a typical
subspace of dimension ≈ 2^{nH(B)}; each codeword's output occupies a
subspace of dimension ≈ 2^{nH(B|A)}. Distinguish using a well-chosen POVM:
roughly 2^{nH(B)}/2^{nH(B|A)} = 2^{nI(A;B)} codewords can be packed in reliably.
Data processing inequality
(Strong subadditivity)
Alice holds A and Bob holds B of a shared state ρ^{AB} with mutual
information I(A;B).
Bob processes B with a quantum channel (an isometry U, then discarding
part of the output), producing B′.
I(A;B) ≥ I(A;B′)
Optimality in the HSW theorem
m → Encoding (ρ state) → ℰ^{⊗n} → Decoding (measurement) → m′
C(ℰ) = max_{{p(x),ρₓ}} I(X;B)
Assume there exists a code with rate R and perfect decoding. Let M be
the random variable corresponding to the uniform distribution over messages.
nR = H(M)       [M has nR bits of entropy]
   = I(M;M′)    [perfect decoding: M = M′]
   ≤ I(A;B)     [data processing]
Sending quantum information
through noisy channels
Physical model of a noisy channel: ℰ
(trace-preserving, completely positive map)
|ψ⟩ ∈ C^d → Encoding (TPCP map) → ℰ^{⊗n} → Decoding (TPCP map) → ψ′
LSD noisy coding theorem: In the limit of many uses, the optimal
rate at which Alice can reliably send qubits to Bob ((1/n) log d) through ℰ
is given by the (regularization of the) formula
Q(ℰ) = max_ρ I_c(ρ, ℰ), where the coherent information
I_c(ρ, ℰ) = H(B) − H(AB) = −H(A|B) is (minus) a conditional entropy!
Entanglement and privacy:
More than an analogy
x = x₁x₂…xₙ → p(y,z|x) → Bob receives y = y₁y₂…yₙ; Eve receives z = z₁z₂…zₙ
How to send a private message from Alice to Bob?
Partition the set of all x into sets of size 2^{n(I(X;Z)+δ)};
choose 2^{n(I(X;Y)−δ)} random x.
Can send private messages at rate I(X;Y) − I(X;Z)  [AC93]
Entanglement and privacy:
More than an analogy
|x⟩^{A′} → U^{⊗n}_{A′→BE} → |φₓ⟩^{BE} = U^{⊗n}|x⟩
How to send a private message from Alice to Bob?
Partition the set of all x into sets of size 2^{n(I(X;E)+δ)};
choose 2^{n(I(X;A)−δ)} random x.
Can send private messages at rate I(X;A) − I(X;E)  [D03]
Entanglement and privacy:
More than an analogy
∑ₓ pₓ^{1/2} |x⟩^A |x⟩^{A′} → U^{⊗n}_{A′→BE} → ∑ₓ pₓ^{1/2} |x⟩^A |φₓ⟩^{BE}
How to send a private message from Alice to Bob?
Partition the set of all x into sets of size 2^{n(I(X;E)+δ)};
choose 2^{n(I(X;A)−δ)} random x.
H(E) = H(AB)  [SW97]
Can send private messages at rate I(X;A) − I(X;E) = H(A) − H(E)  [D03]
Conclusions: Part I
Information theory can be generalized to
analyze quantum information processing
Yields a rich theory of surprising conceptual
simplicity
Operational approach to thinking about
quantum mechanics:
Compression, data transmission, superdense
coding, subspace transmission, teleportation
Some references:
Part I: Standard textbooks:
* Cover & Thomas, Elements of information theory.
* Nielsen & Chuang, Quantum computation and quantum information.
(and references therein)
* Devetak, The private classical capacity and quantum capacity of a
quantum channel, quant-ph/0304127
Part II: Papers available at arxiv.org:
* Devetak, Harrow & Winter, A family of quantum protocols,
quant-ph/0308044.
* Horodecki, Oppenheim & Winter, Quantum information can be
negative, quant-ph/0505062